Unsupervised Learning

Unsupervised learning – Wikipedia, the free encyclopedia.

CHAT

J
is this true? “Machine learning techniques that crunch through very large data sets to learn and gain ‘meaning’ from them tend to work on popularity and frequency bias.”
A
a lot of the words there are ill-defined.
A
popularity of what?
A
frequency of what?
A
bias towards what?
A
i’m also not sure you can learn ‘meaning’
A
also, not sure what ‘meaning’ is in this context
J
bias toward patterns in the data that occur frequently
A
what do you then mean by “to work on”?
A
do they “work” (as in they are useful) because they specifically “look for” patterns?
A
often machine learning is called pattern recognition
J
but is it?
A
so saying that machine learning works because it recognises patterns is reasoning in circles a little bit i guess
A
it’s what they do, not why they work
J
i’m trying to assess if using ‘deep learning’ techniques will result in outliers and freaky data being dismissed
A
dismissed how?
J
not deemed important enough to highlight. in a google search for instance
A
i’m not sure deep learning does that, don’t know much about it
A
i think you probably need some more precisely defined terms to be able to say something about these algorithms
J
it seems to be (i’m sketchy on this) that ML techniques are used to learn about the data without the use of ontologies; the neural nets learn what they need to learn based on the patterns they find
A
you should look into supervised vs. unsupervised learning
J
okay…
A
just the two terms
J
yes – unsupervised
A
supervised is when an algorithm is trained on data that is labeled, so it can learn from examples. unsupervised is when the data isn’t labeled: it finds and labels patterns on its own, but they might not correspond to somebody else’s pattern->label matching
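[Editor's sketch, not part of the chat: a toy illustration of the distinction A describes. The data, the label names and both helper functions are made up for illustration. The supervised part learns from the human labels; the unsupervised part only sees the numbers and returns arbitrary cluster ids (0 and 1) that nobody has named.]

```python
# Toy 1-D data: (value, human label). Only the supervised part uses the labels.
labeled = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
           (5.0, "large"), (5.3, "large"), (4.9, "large")]

def train_supervised(examples):
    """Learn one mean per human-provided label (a nearest-centroid classifier)."""
    sums = {}
    for value, label in examples:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def predict(centroids, value):
    """Return the label whose learned mean is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def cluster_unsupervised(values, iterations=10):
    """Two-means on unlabeled values; returns an arbitrary cluster id per value."""
    centers = [min(values), max(values)]  # simple, assumed initialisation
    assignment = []
    for _ in range(iterations):
        assignment = [min((0, 1), key=lambda k: abs(centers[k] - v)) for v in values]
        for k in (0, 1):
            members = [v for v, a in zip(values, assignment) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assignment

centroids = train_supervised(labeled)           # uses the labels
values = [v for v, _ in labeled]
cluster_ids = cluster_unsupervised(values)      # never sees the labels
```

[The cluster ids happen to line up with "small" and "large" here, but the algorithm has no notion of those names; matching ids to labels is something a person does afterwards.]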
A
i would take ‘meaning’ out, and make it more concrete, and then specify what they do (input->output), not how they work. and specifically state unsupervised learning, not ML in general
J
i get that. so in unsupervised learning, when you are clustering patterns, I want to know what relevance (?) is given to the aspects of the data that don’t fit into any cluster, except the cluster of non-clustering things.
J
thank you, this is v helpful
A
it’s not certain that they don’t fit in any clusters
A
dividing a plane in half still leaves two infinite planes
J
that’s what i mean about the cluster of non-clustering things. there can’t be a no-cluster.
A
clustering algorithms take dimensions of the data into account, if they don’t take a dimension into account it doesn’t matter what it is
A
what are non-clustering things?
A
you can make a no-cluster, it’s just a cluster with the name no-cluster
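[Editor's sketch, not part of the chat: one way to read "a cluster with the name no-cluster". The centers, the distance threshold and the function name are all assumptions chosen for illustration. Each point goes to its nearest center unless it is further away than the threshold, in which case it gets the label "no-cluster".]

```python
import math

def assign(points, centers, max_distance):
    """Label each point with its nearest center's index, or "no-cluster" if too far."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centers]
        nearest = min(range(len(centers)), key=lambda i: distances[i])
        if distances[nearest] <= max_distance:
            labels.append(nearest)
        else:
            labels.append("no-cluster")  # still a label, just one we named ourselves
    return labels

centers = [(0.0, 0.0), (10.0, 10.0)]              # hypothetical cluster centers
points = [(0.5, 0.2), (9.6, 10.1), (5.0, 40.0)]   # the last point is far from both
labels = assign(points, centers, max_distance=2.0)
```

[Without the threshold, the far-away point would simply be put in whichever cluster is nearest, which is the "dividing a plane in half" point made above.]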
J
so when my algorithms have run (are running), I have sets of clusters that are repeat patterns (eyes in a series of faces, say). But what happens to the face with a set of freckles in the shape of Italy that only occurs once but is really quite special? How will the system learn about the specialness of unique things?
A
i wouldn’t call them patterns, just a cluster of data points.
A
a learning algorithm can only see what you give it. you’ll have to give it a dimension for freckles and a dimension for ethnicity, perhaps skin colour or hair colour. you can then ask for points that have not many similar points, but that isn’t necessarily machine learning
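[Editor's sketch, not part of the chat: "ask for points that have not many similar points" can be done with a plain neighbour count, no learning involved. The two feature dimensions, the radius and the neighbour threshold are invented for illustration; a real system would have to decide what the dimensions are.]

```python
import math

def sparse_points(points, radius, min_neighbors):
    """Return the points that have fewer than min_neighbors other points within radius."""
    result = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if neighbors < min_neighbors:
            result.append(p)
    return result

# Hypothetical 2-D descriptions of faces, e.g. (freckle density, freckle spread).
faces = [(0.10, 0.30), (0.12, 0.28), (0.11, 0.33), (0.09, 0.31), (0.95, 0.90)]
unusual = sparse_points(faces, radius=0.1, min_neighbors=2)
```

[This only says the last face is far from the others in the dimensions supplied. Whether that makes it special is a separate judgement.]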
A
if you want to learn about uniqueness as a phenomenon you’ll have to start describing it and then find patterns in that i guess
A
unique things aren’t special
A
per se
J
yes i see that, so i guess unsupervised learning won’t necessarily do that as that would mean labeling some bits of the data (perhaps semi-supervised learning).
J
haha
J
some of them are
A
just pointing out there is a value judgement in there that machine learning just isn’t concerned with
J
ok, yes – loads of things for me to think about. I’ll let you get on…