Natural Language Processing Pipeline

What is the NLP pipeline?

An NLP pipeline is the series of steps followed when building end-to-end NLP software.

Data Acquisition

Collecting data captures a record of past events, so that data analysis can be used to find recurring patterns.

Text Preparation

In natural language processing, text preprocessing is the practice of cleaning and preparing text data. NLTK and the standard-library re module are commonly used in Python to handle many text preprocessing tasks.
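As a rough illustration of what such preprocessing can look like, here is a minimal sketch using only the re module. The function name preprocess and the particular rules (lowercasing, dropping HTML tags and URLs, keeping alphanumeric tokens) are assumptions chosen for the example, not a fixed recipe.

```python
import re

def preprocess(text: str) -> list[str]:
    """Lowercase, strip markup and URLs, and split into word tokens."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    return re.findall(r"[a-z0-9']+", text)     # keep alphanumeric tokens

tokens = preprocess("<p>Visit https://example.com for MORE info!</p>")
print(tokens)  # ['visit', 'for', 'more', 'info']
```

Tags are removed before URLs so that a closing tag glued to the end of a link is not swallowed as part of the URL.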

Text Cleanup

Clean text is human language converted into a form that machine learning models can work with. Text cleaning can be done with simple Python code that removes stopwords, strips non-ASCII (Unicode) characters, and reduces words to their root form through stemming or lemmatization.
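The three cleaning operations mentioned above could be sketched as follows. This is a toy version under stated assumptions: the STOPWORDS set is a tiny hand-picked list, and naive_stem is a crude suffix stripper standing in for a real stemmer or lemmatizer such as those NLTK provides.

```python
import unicodedata

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "are", "was", "and", "of", "to", "in"}

def to_ascii(text: str) -> str:
    """Decompose accented characters, then drop anything outside ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

def naive_stem(word: str) -> str:
    """Toy suffix stripper; a stand-in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def clean(text: str) -> list[str]:
    """Normalize to ASCII, lowercase, drop stopwords, and stem the rest."""
    words = to_ascii(text).lower().split()
    return [naive_stem(w) for w in words if w not in STOPWORDS]

print(clean("The café is serving the cats"))  # ['cafe', 'serv', 'cat']
```

Note that the stem "serv" is not a dictionary word. That is typical of stemming; lemmatization would return "serve" instead, at the cost of needing a vocabulary.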

Feature Engineering

Feature engineering is the process of selecting, manipulating, and transforming raw data into features that can be used in supervised learning. For text, this usually means turning words into numbers, such as word counts. To make machine learning work well on a new task, it may be necessary to design better features.
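One of the simplest text features is the bag-of-words vector, which counts how often each vocabulary term occurs in a document. The sketch below assumes the documents are already tokenized; the helper names build_vocabulary and bag_of_words are made up for this example.

```python
from collections import Counter

def build_vocabulary(docs: list[list[str]]) -> list[str]:
    """Collect every distinct token across the documents, in sorted order."""
    return sorted({token for doc in docs for token in doc})

def bag_of_words(doc: list[str], vocabulary: list[str]) -> list[int]:
    """Count how many times each vocabulary term appears in one document."""
    counts = Counter(doc)
    return [counts[term] for term in vocabulary]

docs = [["good", "movie"], ["bad", "movie", "bad"]]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocab) for doc in docs]

print(vocab)    # ['bad', 'good', 'movie']
print(vectors)  # [[0, 1, 1], [2, 0, 1]]
```

Every document becomes a vector of the same length, which is what most learning algorithms expect as input.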

Modelling

Machine learning or deep learning algorithms are used to build models.
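To make the modelling step concrete, here is a minimal sketch of one classic algorithm, multinomial Naive Bayes with add-one smoothing, written in plain Python. The choice of algorithm, the class name NaiveBayes, and the four-document training set are all assumptions for illustration; in practice a library implementation would normally be used.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.token_counts = defaultdict(Counter)   # token counts per class
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # ignore tokens never seen in training
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Made-up toy training data.
train_docs = [["great", "film"], ["loved", "it"], ["terrible", "film"], ["hated", "it"]]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict(["great", "great"]))  # pos
```

Scores are summed as logarithms rather than multiplied as probabilities, which avoids numerical underflow on longer documents.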

Deployment

This is the final step, where the model is deployed to the cloud or another hosting service.
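One common way to deploy a model is to wrap it in a small HTTP service that accepts text and returns a prediction. The sketch below uses only the Python standard library. The predict function is a hypothetical keyword rule standing in for a trained model, and the JSON shape ({"text": ...} in, {"label": ...} out) and port 8000 are assumptions for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(text: str) -> str:
    """Placeholder for a trained model (hypothetical keyword rule)."""
    return "pos" if "good" in text.lower() else "neg"

def handle_payload(body: bytes) -> bytes:
    """Decode a JSON request like {"text": "..."} and build a JSON response."""
    request = json.loads(body)
    response = {"label": predict(request["text"])}
    return json.dumps(response).encode("utf-8")

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        result = handle_payload(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)

def serve(port: int = 8000) -> None:
    """Start the server; call this from the process that hosts the model."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Calling serve() on the hosting machine starts the service. Keeping handle_payload separate from the handler class means the request logic can be tested without opening a network socket. A production deployment would also need error handling, input validation, and a production-grade server, which this sketch omits.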

Jeevan Henry Dsouza's team blog