Stanford POS tagger Tutorial | Stanford’s Part of Speech Label Demo
Introduction
The Stanford CoreNLP library lets you tag the words in a string: for each word, the tagger decides whether it is a noun, a verb, and so on, and attaches the resulting tag to the word. For example:
“Karma of humans is AI”
will be output as
Karma/NN of/IN humans/NNS is/VBZ AI/NNP
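To make the tags in this example easier to read, here is a minimal sketch of a lookup table for the five tags that appear in it. The class name TagLegend and its describe method are hypothetical helpers written for this tutorial and are not part of CoreNLP; the descriptions follow the Penn Treebank tag set, which the English tagger models use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of CoreNLP): describes a few Penn Treebank tags.
final class TagLegend {
    private static final Map<String, String> DESCRIPTIONS = new LinkedHashMap<>();

    static {
        DESCRIPTIONS.put("NN", "noun, singular or mass");
        DESCRIPTIONS.put("NNS", "noun, plural");
        DESCRIPTIONS.put("NNP", "proper noun, singular");
        DESCRIPTIONS.put("IN", "preposition or subordinating conjunction");
        DESCRIPTIONS.put("VBZ", "verb, third person singular present");
    }

    // Returns a description for the tag, or "unknown tag" if it is not in this small table.
    static String describe(String tag) {
        return DESCRIPTIONS.getOrDefault(tag, "unknown tag");
    }

    public static void main(String[] args) {
        // The tags from the example sentence, in order.
        for (String tag : new String[] {"NN", "IN", "NNS", "VBZ", "NNP"}) {
            System.out.println(tag + " = " + describe(tag));
        }
    }
}
```

The table covers only the tags used in this tutorial; the full tag set is considerably larger.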
Prerequisite
- Java
- Maven
Steps to Follow
Here are the steps for using the Stanford POS tagger in your Java project.
- Create a new project.
- Create a new folder called “taggers”.
- Download basic English Stanford Tagger from here
- Extract the zip file and open the extracted folder.
- Copy all the contents of the extracted folder and paste them into the taggers folder.
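Once the files are copied, it can help to confirm that the model file is where the code expects it before loading the tagger. The following is a rough sketch using only the JDK; the class ModelLocator and its methods are hypothetical, and the taggers/models layout is an assumption taken from the model path used later in this tutorial.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: builds the expected model path and checks that the file exists.
final class ModelLocator {
    // Model file name as used in this tutorial.
    static final String MODEL_FILE = "english-left3words-distsim.tagger";

    // Assumed layout: <projectRoot>/taggers/models/<model file>.
    static Path resolveModel(Path projectRoot) {
        return projectRoot.resolve("taggers").resolve("models").resolve(MODEL_FILE);
    }

    // True only if the model path points at an existing regular file.
    static boolean modelExists(Path projectRoot) {
        return Files.isRegularFile(resolveModel(projectRoot));
    }

    public static void main(String[] args) {
        // Use the first argument as the project root, or the current directory by default.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        Path model = resolveModel(root);
        System.out.println(model.toAbsolutePath() + (modelExists(root) ? " found" : " is missing"));
    }
}
```

Building the path from the project root this way also avoids hard-coding an absolute path that only works on one machine.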
Generating Tokens
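Here is the code to tag the sentence “Karma of humans is AI”.
// Load the English model (adjust the path to where your taggers folder lives).
MaxentTagger tagger = new MaxentTagger("/Users/admin/LearningSourceControl/CoreNLP/taggers/models/english-left3words-distsim.tagger");
// Split the raw text into sentences, each a list of tokens.
List<List<HasWord>> sentences = MaxentTagger.tokenizeText(new StringReader("Karma of humans is AI"));
for (List<HasWord> sentence : sentences) {
    // Tag every token in the sentence.
    List<TaggedWord> tSentence = tagger.tagSentence(sentence);
    // Print the sentence as word/TAG pairs.
    System.out.println(SentenceUtils.listToString(tSentence, false));
}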
Complete Code
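package com.interviewBubble.pos;

import java.io.StringReader;
import java.util.List;

import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.SentenceUtils;
import edu.stanford.nlp.ling.TaggedWord;
import edu.stanford.nlp.tagger.maxent.MaxentTagger;

public class TaggerDemo {

    private TaggerDemo() {}

    public static void main(String[] args) throws Exception {
        // Load the English model (adjust the path to where your taggers folder lives).
        MaxentTagger tagger = new MaxentTagger("/Users/admin/LearningSourceControl/CoreNLP/taggers/models/english-left3words-distsim.tagger");
        // Split the raw text into sentences, each a list of tokens.
        List<List<HasWord>> sentences = MaxentTagger.tokenizeText(new StringReader("Karma of humans is AI"));
        for (List<HasWord> sentence : sentences) {
            // Tag every token, then print the sentence as word/TAG pairs.
            List<TaggedWord> tSentence = tagger.tagSentence(sentence);
            System.out.println(SentenceUtils.listToString(tSentence, false));
        }
    }
}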
OUTPUT
0 [main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from /Users/admin/LearningSourceControl/CoreNLP/taggers/models/english-left3words-distsim.tagger ... done [0.7 sec].
Karma/NN of/IN humans/NNS is/VBZ AI/NNP
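If you want to work with the printed result rather than just read it, the word/TAG line can be split back into pairs with plain Java. The sketch below is one way to do that and then keep only the nouns; TaggedOutputParser and its methods are hypothetical names for this tutorial, and the code assumes the word/TAG format shown in the output above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses a "word/TAG word/TAG ..." line produced by the demo.
final class TaggedOutputParser {

    // Returns {word, tag} pairs; splits on the last '/' so a word containing '/' stays intact.
    static List<String[]> parse(String line) {
        List<String[]> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            int slash = token.lastIndexOf('/');
            if (slash <= 0 || slash == token.length() - 1) {
                continue; // skip tokens that lack a word or a tag
            }
            pairs.add(new String[] {token.substring(0, slash), token.substring(slash + 1)});
        }
        return pairs;
    }

    // Keeps the words whose tag starts with "NN" (NN, NNS, NNP, NNPS).
    static List<String> nouns(String line) {
        List<String> result = new ArrayList<>();
        for (String[] pair : parse(line)) {
            if (pair[1].startsWith("NN")) {
                result.add(pair[0]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [Karma, humans, AI]
        System.out.println(nouns("Karma/NN of/IN humans/NNS is/VBZ AI/NNP"));
    }
}
```

In a real project you would more likely read the word and tag directly from each TaggedWord instead of parsing text, but the string form is handy when the output has already been logged or saved.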
POM.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.interviewBubble</groupId>
<artifactId>CoreNLP</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>CoreNLP</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<stanford.corenlp.version>3.9.1</stanford.corenlp.version>
</properties>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- Stanford dependencies -->
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>${stanford.corenlp.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<classifier>models</classifier>
<version>${stanford.corenlp.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.25</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.25</version>
</dependency>
</dependencies>
</project>
Log4j.properties
# Set root logger level to DEBUG and its only appender to A1.
log4j.rootLogger=DEBUG, A1
# A1 is set to be a ConsoleAppender.
log4j.appender.A1=org.apache.log4j.ConsoleAppender
# A1 uses PatternLayout.
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n