Natural language processing (NLP) is a subfield of artificial intelligence that involves transforming or extracting useful information from natural language data. Methods include machine-learning and rule-based approaches.


0
votes
0answers
7 views

Find the similar texts across the python dataframe

Suppose I have a python dataframe as follows, data['text'] abc.google.com d-2667808233512566908.ampproject.net d-27973032622323999654.ampproject.net def.google.com d-28678547673442325000.ampproject....
0
votes
0answers
8 views

What types of input should I use in this neural network?

I have a system in which users can send many chat messages, and many of them tend to be questions, like "How do I register an account?" "How do I do this?" "Where is X?" To help with this, I'm ...
0
votes
0answers
10 views

How to learn to score new documents based on an existing set of scored documents?

I have 50,000 documents of 1000 words or more, ranked between 0 and 2000. They all deal with a similar topic. I'd like to create an algorithm that can learn to score new documents. What approach do ...
0
votes
0answers
9 views

Trouble recognizing one word intents

I'm using wit to recognize different intents in a retail context. Some of them (successfully) trigger FAQ answers, others initiate business logic. Surprisingly, I'm having a lot of trouble with the ...
0
votes
0answers
21 views

Part-of-speech without Python

I am trying to tag a French text, but TreeTagger needs Python, which I cannot install on my PC. At my workplace, for security reasons, it is impossible to install other programs (only R). Is ...
0
votes
1answer
12 views

MetaClass couldn't create public edu.stanford.nlp.time.TimeExpressionExtractorImpl(java.lang.String,java.util.Properties) with args [sutime, {}]

I am using the method described on the stanford CoreNLP page here. In order to run Stanford CoreNLP from the command line the following command is used : java -cp "*" -Xmx2g edu.stanford.nlp....
0
votes
0answers
7 views

How to get multiple types for a single value in Apache OpenNLP NER

I am using NER of Apache OpenNLP. I have successfully trained my custom data and obtained the types. But, I need it to return multiple types. For example, Let's consider I have 'white' as 'color' and '...
0
votes
1answer
9 views

wit.ai handling words with or w/out diacritics as synonyms

How do I train the bot to recognize words with or without diacritics as synonyms? Is this feature enabled? E.g. when I have a word (camera): foťák, I would like it to be treated as a synonym of fotak. ...
0
votes
0answers
15 views

How can/Can I use NLTK to parse custom analytics questions?

I am new to NLP and want to understand if NLTK is the right choice for my requirements. I have to build a bot which can understand user input and translate it into meaningful queries. The user will ...
0
votes
0answers
25 views

Finding the Head Word in Python

I need to extract the head words of sentences (more specifically, the head words of the highest noun phrase in a sentence). I am using the Stanford CoreNLP server through py-corenlp to annotate my ...
-3
votes
0answers
16 views

Compare two English text corpora

I have two English text corpora. One is people talking about topic "A" while the other is people talking about topic "B". From a language point of view - the way people express themselves on topic "A" is ...
0
votes
1answer
14 views

NLTK MaxentClassifier train with negative cases

I am new to the nltk library and I am trying to teach my classifier some labels with my own corpus. For this I have a file with IOB tags like this: How O do B-MYTag you I-MYTag know O , O where B-MYTag to ...
1
vote
1answer
23 views

Double Batching Tensorflow Input Data

I am implementing a convnet for token classification of string data. I need to take in string data from a TFRecord, batch shuffled, then perform some processing which expands the data, and batch ...
0
votes
3answers
33 views

Counting Avg Number of Words Per Sentence

I'm having a bit of trouble trying to count the number of words per sentence. For my case, I'm assuming sentences only end with either "!", "?", or "." I have a list that looks like this: ["Hey, "!",...
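One way to approach the counting described above, as a minimal sketch: it assumes the input is a flat list of tokens in which "!", "?" and "." appear as separate items, and that every other item counts as one word. The function name `avg_words_per_sentence` and the sample list in the usage note are hypothetical and not taken from the question.

```python
# Tokens that close a sentence, per the assumption stated in the question.
SENTENCE_ENDERS = {"!", "?", "."}


def avg_words_per_sentence(tokens):
    """Return the mean number of words per sentence in a flat token list."""
    sentence_lengths = []
    current = 0  # words seen so far in the sentence being read
    for tok in tokens:
        if tok in SENTENCE_ENDERS:
            # Close the sentence; skip empty ones such as "!!" runs.
            if current:
                sentence_lengths.append(current)
            current = 0
        else:
            current += 1
    # Count a trailing sentence that has no closing punctuation.
    if current:
        sentence_lengths.append(current)
    if not sentence_lengths:
        return 0.0
    return sum(sentence_lengths) / len(sentence_lengths)
```

For a hypothetical list such as `["Hey", "!", "How", "are", "you", "?", "Fine", "."]`, the sentence lengths are 1, 3 and 1, so the function returns their mean.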
0
votes
0answers
5 views

Detect conditional tense via spacy?

Is there any function/attribute built into spaCy to uncover tokens in the conditional tense? Or some possible detour to get to those?
0
votes
0answers
13 views

How to modify a grammar or write one for given sentence?

With a given list (9) of sentences, how can one write a grammar or modify an existing one to ensure it produces the sentences? For example, NP -> Det N, VP -> V NP, S -> either S or S... and then ...
0
votes
1answer
26 views

Why is “MacBook” an Entity, but not “laptop”?

I'm building a Twitterbot to reply to tweets about broken products. I was trying to use the IBM Watson AlchemyLanguage Entities API to extract products and product types from plain text. Sadly, it ...
1
vote
0answers
20 views

Comparing two maps to calculate precision and recall for NER

I am trying to calculate precision and recall for our Named Entity Recognizer by comparing our output to a gold set output. annotationMap is the gold set map and myMap is the output of my NER. To give ...
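The comparison described above can be sketched in Python as follows. This is an illustration under assumptions, not the asker's code: both maps are assumed to go from an entity string to a single entity type, a prediction counts as correct only when the gold map holds the same entity with the same type, and the function name `precision_recall` is hypothetical.

```python
def precision_recall(gold, predicted):
    """Compare two {entity: type} maps and return (precision, recall)."""
    # A true positive is a predicted entity whose type matches the gold map.
    true_positives = sum(
        1 for entity, label in predicted.items()
        if entity in gold and gold[entity] == label
    )
    # Precision: share of predictions that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold entities that were found with the right type.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

A real evaluation would usually key on entity spans (offsets) rather than on surface strings, since the same string can occur several times with different types. The string-keyed version is kept here only because the question describes plain maps.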
0
votes
0answers
16 views

Word to phonemes converter for slang and made-up words in Python

I currently use NLTK to convert words to phonemes in Python. This works well for words in the library, but for slang and made-up words, NLTK doesn't work. E.g. words like "whasup" "dawg" Is there a ...
-1
votes
0answers
11 views

Natural Language Processing for Synonyms

What NLP tool can I use to automatically differentiate between synonyms? For example, when parsing a sentence "the wind is blowing," my parser should know that "wind" refers to "air blowing" and not "...
1
vote
0answers
35 views

What does ($pre =~ /\./ && $pre =~ /\p{IsAlpha}/) mean in the Moses Tokenizer?

Moses Tokenizer is the tokenizer widely used in machine translation and natural language processing experiments. There is a line of regex that checks for: if (($pre =~ /\./ && $pre =~ /\p{...
1
vote
1answer
40 views

Keras sequence classification in python

I am trying to perform sequence classification using keras in python 3. I am trying to classify sequences of words. In my data, I used word_2_vec to transform the words to an array of shape 300. My ...
0
votes
1answer
17 views

Error creating edu.stanford.nlp.time.TimeExpressionExtractorImpl

I am running the cort coreference resolution from this github repo. Using the syntax to run the system on raw input text as follows: cort-predict-raw -in *.txt \ -model model.obj \ ...
3
votes
1answer
21 views

textmining graph sentences in python

I'm trying to solve a text mining problem in Python which consists of: Target: Create a graph composed of nodes (sentences) by tokenizing a paragraph into sentences, their edges would be their ...
0
votes
1answer
11 views

load pre-trained word2vec model for doc2vec

I'm using gensim to extract a feature vector from a document. I've downloaded the pre-trained model from Google named GoogleNews-vectors-negative300.bin and I loaded that model using the following ...
0
votes
0answers
11 views

Pointwise Mutual Information from scratch

I want to write my own PMI (Python) code without relying on the NLTK. I know I have to use the formula log(p(x and y)/(p(x)p(y))). Suppose I have a corpus C which contains N words and I am looking for ...
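The formula above can be sketched without NLTK along these lines. The estimation choices are assumptions, not something the question specifies: p(x) and p(y) are unigram relative frequencies over the N tokens, p(x and y) is the relative frequency of the adjacent bigram (x, y) over the N-1 bigrams, and the natural logarithm is used. The function name `pmi` is hypothetical.

```python
import math
from collections import Counter


def pmi(tokens, x, y):
    """Pointwise mutual information of the adjacent word pair (x, y)."""
    if len(tokens) < 2:
        raise ValueError("need at least two tokens to form a bigram")
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    # Joint probability: share of adjacent pairs that are exactly (x, y).
    p_xy = bigram_counts[(x, y)] / (len(tokens) - 1)
    if p_xy == 0:
        # The pair never occurs, so log(0) is taken as minus infinity.
        return float("-inf")
    # Marginal probabilities from unigram relative frequencies.
    p_x = unigram_counts[x] / len(tokens)
    p_y = unigram_counts[y] / len(tokens)
    return math.log(p_xy / (p_x * p_y))
```

Swapping `math.log` for `math.log2` gives PMI in bits. Defining co-occurrence over a wider window or per document instead of strict adjacency changes only how `p_xy` is counted.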
1
vote
1answer
30 views

how to read and write TermDocumentMatrix in r?

I made wordcloud using a csv file in R. I used TermDocumentMatrix method in the tm package. Here is my code: csvData <- read.csv("word", encoding = "UTF-8", stringsAsFactors = FALSE) Encoding(...
0
votes
0answers
15 views

Is natural language processing (NLP) a component of text mining? If so, how? [on hold]

Is natural language processing (NLP) a component of text mining? If so, how? Can you help me with your answers? Thanks
1
vote
2answers
26 views

keep trailing punctuation in python nltk.word_tokenize

There's a ton available about removing punctuation, but I can't seem to find anything keeping it. If I do: from nltk import word_tokenize test_str = "Some Co Inc. Other Co L.P." word_tokenize(...
3
votes
1answer
45 views

SpaCy: how to load Google news word2vec vectors?

I've tried several methods of loading the google news word2vec vectors (https://code.google.com/archive/p/word2vec/): en_nlp = spacy.load('en',vector=False) en_nlp.vocab.load_vectors_from_bin_loc('...
0
votes
0answers
50 views

ValueError: invalid literal for int() with base 10: '_'

I am using cort, a coreference resolution toolkit. I installed cort using: pip install cort with Python version 2.7. However, while running the command as mentioned in the documentation: ...
0
votes
0answers
11 views

Iterating the object returned by ShiftReduceParser using Python 3

I'm trying to print the tree by parsing through the object returned by the ShiftReduceParser using nltk.parse in python 3.6. Below is the code: from nltk import CFG from nltk.parse import ...
0
votes
1answer
12 views

Segmentation and Collocation

I am looking for new ideas for two features I am implementing. 1.) Text segmentation feature: Ex: User Query: Resolved Query: ----------- ...
3
votes
1answer
50 views

English verbs processing ending with 'e'

I am implementing a few string replacers, with these conversions in mind 'thou sittest' → 'you sit' 'thou walkest' → 'you walk' 'thou liest' → 'you lie' 'thou risest' → 'you rise' If I keep it naive ...
0
votes
1answer
15 views

Strategies to deal with new terms in test data set

I'm using a word2vec model to build a classifier on a training data set and wonder what the techniques are for dealing with unseen terms (words) in the test data. Removing new terms doesn't seem like the best ...
0
votes
0answers
6 views

Wit AI Response Time

I have an application that makes REST calls to get responses (entities and intents) from WIT AI. After doing some load testing, I was able to get 900-1000 ms on average response time from Wit. However,...
-1
votes
0answers
13 views

How to use pre-trained word embeddings in a TensorFlow RNN model?

I want to use pre-trained word embeddings in my tensorflow model. I followed this link Using a pre-trained word embedding (word2vec or Glove) in TensorFlow What I am asking is that, what should be ...
1
vote
0answers
22 views

Word level sentence generation using keras

I am new to keras. I am trying to build a word-level sentence generation module using keras. I use a vocabulary size of 8000 and one-hot-vector representation of words in my corpus. This is my model ...
1
vote
1answer
39 views

Write filtered ngrams into outfile - list of lists

I extracted threegrams from a bunch of HTML files following a certain pattern. When I print them, I get a list of lists (where each line is a threegram). I would like to print it to an outfile for ...
0
votes
2answers
45 views

Natural language word (phrase) for cyc term

I am working on a Natural Language Generation task and need to retrieve the natural language word or phrase equivalent of a Cyc term. E.g. "#$EatingEvent" -> "eat" or "#$Coyote-Animal" -> "coyote". How can ...
-3
votes
0answers
21 views

Active voice to passive voice [on hold]

I am working on a project in C# (ASP.NET MVC web application). The main functionality of the project is to convert active voice to passive voice and passive voice to active voice. Similarly direct speech ...
0
votes
0answers
13 views

Rasa Nlu testing: TypeError: object pickle not returning list

I'm trying to test rasa from Python, and I'm getting this error. I don't know what I'm doing wrong. Here is my metadata.json file { "intent_classifier": "./model_20170206-173042/intent_classifier....
-3
votes
0answers
9 views

I need to find a dictionary for my translation apps [on hold]

I want to make a translation application like Lingoes for the web version and Dict Box for the mobile version, but I don't know where to find the dictionary. If you have any information about this, please share it.
0
votes
1answer
20 views

LUIS website hangs up at initializing

I am working with the LUIS NLP service provided through the website www.luis.ai. I have not been able to open this website for a few days; it hangs right on the account login page with the message - "Please wait a few ...
0
votes
0answers
23 views

C# Stanford NLP online demo giving different output

Using Stanford Core NLP 3.7.0 to create Dependency tree in C#. Same sentence giving different output in my application and the online Parser demo Sentence: Display Prime HomePage on page load. ...
0
votes
0answers
9 views

How can we remove referential ambiguity from an input

I have an input file which has referential ambiguities. For example: Database is a collection of data. It is used for storing information. So here "it" refers to Database, but how to do that ...
-1
votes
0answers
27 views

How to get only significant tweets from a keyword and remove the irrelevant ones to the topic? [on hold]

I am currently working on a project wherein I have to harvest fire disaster-related tweets. When I use 'fire' or 'burning' as keywords, tweets that are not really fire disasters show up like 'I'm on ...
0
votes
2answers
30 views

how to dynamically build new json from old in javascript [duplicate]

I receive a json object with some number of quick reply elements from wit.ai, like this: "msg": "So glad to have you back. What do you want me to do? "action_id": "6fd7f2bd-db67-46d2-8742-...
0
votes
1answer
49 views

how to build json array dynamically in javascript

I receive a json object with some number of quick reply elements from wit.ai, like this: "msg": "So glad to have you back. What do you want me to do? "action_id": "6fd7f2bd-db67-46d2-8742-...
0
votes
0answers
22 views

Training SyntaxNet parser: what should be the output?

I'm trying to train the SyntaxNet parser on my own data (CoNLL) following the process in https://github.com/tensorflow/models/tree/master/syntaxnet. I have made it successfully through the first five ...