According to industry estimates, only 21% of the available data is present in structured form. Data is being generated as we speak, as we tweet, as we send messages on WhatsApp, and in various other activities. The majority of this data exists in textual form, which is highly unstructured in nature.

A few well-known examples include tweets and posts on social media, user-to-user chat conversations, news, blogs and articles, product or service reviews, and patient records in the healthcare sector. More recent examples include chatbots and other voice-driven bots.

Although text data is high-dimensional, the information present in it is not directly accessible unless it is processed (read and understood) manually or analyzed by an automated system.

To produce significant and actionable insights from text data, it is important to get acquainted with the techniques and principles of Natural Language Processing (NLP).