This is an updated and corrected version of the data set used by Sejnowski and Rosenberg in their influential study of speech generation using a neural ...

multivariate

The problem is specified by the accompanying data file, "vowel.data". This consists of a three-dimensional array: voweldata [speaker, vowel, input]. Th...

classification

This loop sensor data was collected for the Glendale on-ramp for the 101 North freeway in Los Angeles. It is close enough to the stadium to see unusual...

time-series, multivariate

For each text collection, D is the number of documents, W is the number of words in the vocabulary, and N is the total number of words in the collection...
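
As a minimal sketch of how these three counts relate, the following Python computes D, W, and N for a tiny made-up collection. The documents and the whitespace tokenization are assumptions for illustration only; they are not taken from the data set.

```python
from collections import Counter

# Hypothetical toy collection (not part of the data set).
docs = [
    "the cat sat on the mat",
    "the dog sat",
    "a cat and a dog",
]

# Assumed tokenization: split on whitespace.
tokenized = [doc.split() for doc in docs]

# Per-document word counts, i.e. a bag-of-words representation.
bags = [Counter(tokens) for tokens in tokenized]

D = len(bags)                                   # number of documents
vocabulary = sorted(set().union(*bags))         # distinct words across all documents
W = len(vocabulary)                             # vocabulary size
N = sum(sum(bag.values()) for bag in bags)      # total word occurrences

print(D, W, N)
```

Note that N counts every occurrence, so a word appearing twice in a document contributes two to N but only one to W.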

text, clustering

Each record represents 100 points on a two-dimensional graph. When plotted in order (from 1 through 100) as the Y co-ordinate, the points will create ei...

classification, sequential

The original data were formatted by Thorsten Joachims in the bag-of-words representation. There were 9947 features (of which 2562 are always zeros for a...

multivariate, classification

MADELON is an artificial dataset containing data points grouped in 32 clusters placed on the vertices of a five-dimensional hypercube and randomly label...
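
The figure of 32 clusters matches the number of vertices of a five-dimensional hypercube, which is 2^5. A minimal sketch that enumerates them, assuming vertex coordinates of -1 and +1 (an illustrative choice, not necessarily the scaling used to generate the data):

```python
from itertools import product

DIMENSIONS = 5

# Each vertex of the hypercube picks one of two values per axis.
# The values -1 and +1 are assumed here for illustration.
vertices = list(product((-1, 1), repeat=DIMENSIONS))

print(len(vertices))   # one vertex per cluster centre
print(vertices[0])     # first vertex in enumeration order
```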

multivariate, classification

USPTO Algorithm Challenge, run by NASA-Harvard Tournament Lab and TopCoder. Problem: Patent Labeling

domain-theory, classification

The dataset (movement_libras) contains 15 classes of 24 instances each, where each class refers to a hand movement type in LIBRAS. In the video pre...

multivariate, classification, clustering, sequential

Dataset of 8800 (10 digits x 10 repetitions x 88 speakers) time series of 13 Mel Frequency Cepstral Coefficients (MFCCs) taken from 44 males and 44 fem...

time-series, multivariate, classification