Digit Recognizer

Data Introduction

Click here for the original text.

The data files train.csv and test.csv contain gray-scale images of hand-drawn digits, from zero through nine.

Each image is 28 pixels high and 28 pixels wide, for a total of 784 pixels. Each pixel has a single pixel value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel value is an integer between 0 and 255, inclusive.

The training data set (train.csv) has 785 columns. The first column, called "label", is the digit that was drawn by the user. The remaining 784 columns contain the pixel values of the associated image.
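As a rough illustration of this column layout, the sketch below builds a one-row stand-in for train.csv in memory and splits the record into its label and its 784 pixel values. The row contents are made up, and the helper name `split_row` is hypothetical; only the column names ("label", pixel0 through pixel783) come from the description above.

```python
import csv
import io

# Synthetic stand-in for train.csv: the real header plus one made-up row.
header = ["label"] + [f"pixel{x}" for x in range(784)]
fake_row = [5] + [0] * 784      # label 5, every pixel 0 (invented data)
fake_row[1 + 31] = 255          # set pixel31 to the maximum pixel value

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerow(fake_row)
buffer.seek(0)


def split_row(row):
    """Split one train.csv-style record into (label, pixels)."""
    label = int(row["label"])
    pixels = [int(row[f"pixel{x}"]) for x in range(784)]
    return label, pixels


reader = csv.DictReader(buffer)
label, pixels = split_row(next(reader))
print(len(reader.fieldnames), label, len(pixels), pixels[31])
```

With the real file, the same `split_row` helper could presumably be applied to each record read from `open("train.csv", newline="")`.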

Each pixel column in the training set has a name like pixelx, where x is an integer between 0 and 783, inclusive. To locate this pixel on the image, decompose x as x = i * 28 + j, where i and j are integers between 0 and 27, inclusive. Then pixelx is located on row i and column j of a 28 x 28 matrix (indexing from zero).

For example, pixel31 is the pixel in the fourth column from the left and the second row from the top (31 = 1 * 28 + 3, so i = 1 and j = 3), as in the ASCII diagram below.

Visually, if we omit the "pixel" prefix, the pixels make up the image like this:

000 001 002 003 ... 026 027
028 029 030 031 ... 054 055
056 057 058 059 ... 082 083
………
728 729 730 731 ... 754 755
756 757 758 759 ... 782 783
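The decomposition x = i * 28 + j is an integer division with remainder, so it can be sketched with Python's built-in `divmod`. The function names `pixel_position`, `pixel_index`, and `to_grid` are hypothetical helpers introduced here for illustration; they are not part of the competition material.

```python
WIDTH = 28  # images are 28 x 28 pixels


def pixel_position(x):
    """Return the zero-indexed (row, column) of pixelx."""
    if not 0 <= x < WIDTH * WIDTH:
        raise ValueError("x must be between 0 and 783, inclusive")
    return divmod(x, WIDTH)  # quotient is the row i, remainder is the column j


def pixel_index(i, j):
    """Inverse mapping: the x such that pixelx sits at row i, column j."""
    return i * WIDTH + j


def to_grid(pixels):
    """Reshape a flat list of 784 pixel values into 28 rows of 28 values."""
    return [pixels[r * WIDTH:(r + 1) * WIDTH] for r in range(WIDTH)]


print(pixel_position(31))
```

Row 1 and column 3 are zero-indexed, which matches "second row from the top, fourth column from the left" in the example above.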

The test data set (test.csv) has the same format as the training set, except that it does not contain the "label" column.

Your submission file should be in the following format: for each of the 28000 images in the test set, output a single line containing the ImageId and the digit you predict. For example, if you predict that the first image is of a 3, the second image is of a 7, and the third image is of an 8, then your submission file would look like:

ImageId,Label
1,3
2,7
3,8
(27997 more lines)
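One possible way to produce this format is sketched below, assuming the predictions are held in a plain list ordered like the rows of test.csv and that ImageId starts at 1, as in the example above. The function name `format_submission` is hypothetical.

```python
import csv
import io


def format_submission(predictions):
    """Return submission CSV text for a list of predicted digits.

    ImageId is taken to be the 1-based position of each test image.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ImageId", "Label"])
    for image_id, digit in enumerate(predictions, start=1):
        writer.writerow([image_id, digit])
    return buffer.getvalue()


print(format_submission([3, 7, 8]))
```

The returned text can then be written to a file; a full submission would pass all 28000 predictions rather than the three shown here.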

The evaluation metric for this contest is the categorization accuracy, that is, the proportion of test images that are correctly classified. For example, a categorization accuracy of 0.97 indicates that you have correctly classified all but 3% of the images.
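The metric can be sketched as a simple ratio. The function name `categorization_accuracy` and the sample digits are made up for illustration; the true test labels are not published, so in practice this would be computed on a held-out part of train.csv.

```python
def categorization_accuracy(predicted, actual):
    """Proportion of images whose predicted digit equals the true digit."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one image is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Invented example: three of the four predictions are right.
print(categorization_accuracy([3, 7, 8, 1], [3, 7, 8, 2]))
```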
