Index of /datasets/mnist

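==========  ================  ====  ========================================
Name        Last modified     Size  Description
==========  ================  ====  ========================================
mnist.hdf5  2011-01-30 14:04  11M   The MNIST Database Of Handwritten Digits
README.rst  2012-03-23 15:34  2.5K
MD5SUMS     2011-01-30 14:05  45
==========  ================  ====  ========================================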

.. _datadb_mnist:

LeCun et al. (1999): The MNIST Dataset Of Handwritten Digits (Images)

The MNIST_ dataset of handwritten digits, available from this page, has a
training set of 60,000 examples, and a test set of 10,000 examples. It is a
subset of a larger set available from NIST.  The digits have been
size-normalized and centered in a fixed-size image.  It is a good database for
people who want to try learning techniques and pattern recognition methods on
real-world data while spending minimal effort on preprocessing and formatting.

See the MNIST_ website for more information.

.. note::

  The version that is offered here is identical to the four files distributed
  there, but has been converted into a single HDF5 file that can easily be read
  by PyMVPA.

Terms Of Use

`Yann LeCun`_ (Courant Institute, NYU) and `Corinna Cortes`_ (Google
Labs, New York) hold the copyright of the MNIST_ dataset, which is a
derivative work from original NIST datasets.  The MNIST_ dataset is made
available under the terms of the `Creative Commons Attribution-Share
Alike 3.0`_ license.

.. _MNIST:
.. _Creative Commons Attribution-Share Alike 3.0:
.. _Yann LeCun:
.. _Corinna Cortes:


A single HDF5 file containing the entire MNIST_ dataset is available from


* An HDF5 access facility.
* *PyMVPA 0.5* (or later) provides the `h5load()` function (which uses the H5PY_ package).
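
Since the directory also provides an ``MD5SUMS`` file, a download can be checked
before it is loaded. The following is a minimal sketch using only the Python
standard library. It assumes that ``MD5SUMS`` uses the usual ``md5sum`` layout
(one ``<digest>  <filename>`` entry per line), which is not confirmed here; the
helper names ``md5_of_file``, ``parse_md5sums`` and ``verify`` are hypothetical
and not part of PyMVPA.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the whole file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sums(text):
    """Map file names to expected digests (assumes `md5sum`-style lines)."""
    expected = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        checksum, name = line.split(None, 1)
        # `md5sum` marks binary-mode entries with a leading '*'.
        expected[name.strip().lstrip("*")] = checksum.lower()
    return expected


def verify(data_path, md5sums_path, name="mnist.hdf5"):
    """Return True if the digest of `data_path` matches the entry for `name`."""
    with open(md5sums_path) as handle:
        expected = parse_md5sums(handle.read())
    # A missing entry yields None, which never equals a digest string.
    return expected.get(name) == md5_of_file(data_path)
```

With both files in the current directory, ``verify("mnist.hdf5", "MD5SUMS")``
would return ``True`` only if the stored and computed digests agree.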

.. _H5PY:


  >>> import os
  >>> from mvpa2.suite import *
  >>> # locate the HDF5 file below the PyMVPA data DB root directory
  >>> filepath = os.path.join(pymvpa_datadbroot, 'mnist', 'mnist.hdf5')
  >>> # the file holds the training and test sets under 'train' and 'test'
  >>> datasets = h5load(filepath)
  >>> train = datasets['train']
  >>> test = datasets['test']
  >>> print(train)
  <Dataset: 60000x784@uint8, <sa: labels>>
  >>> print(test)
  <Dataset: 10000x784@uint8, <sa: labels>>
  >>> # assign a mapper able to recreate 28x28 pixel image arrays
  >>> test.a.mapper = FlattenMapper(shape=(28, 28))
  >>> test.mapper.reverse(test).shape
  (10000, 28, 28)
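
Each sample is stored as a row of 784 values because a 28x28 image has
28 * 28 = 784 pixels. The sketch below illustrates that relationship with plain
NumPy instead of PyMVPA. It uses random stand-in values rather than MNIST data,
and it assumes row-major (C) ordering of the pixels, which is NumPy's default
but is not confirmed here for the stored file.

```python
import numpy as np

# Stand-in for the samples: 5 flattened "images" of 784 uint8 pixels each
# (random values, not real MNIST data).
rng = np.random.RandomState(0)
flat = rng.randint(0, 256, size=(5, 784)).astype(np.uint8)

# Reverse mapping: restore each 784-element row to a 28x28 pixel array.
images = flat.reshape(-1, 28, 28)

# Forward mapping: flatten the images back into rows of 784 features.
flat_again = images.reshape(len(images), -1)
```

Under the row-major assumption, pixel ``(row, col)`` of an image corresponds to
feature index ``row * 28 + col`` of the flattened sample.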


:ref:`LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998) <LBB+98>`.
Gradient-based learning applied to document recognition.
Proceedings of the IEEE, 86, 2278--2324.