Stop words removal using arrays in C#
I have a string array of stopwords and a string array of input texts, i.e.
string[] stopwords = File.ReadAllLines(@"c:\stopwords.txt");
and
con.Open();
SqlCommand query = con.CreateCommand();
query.CommandText = "SELECT p_abstract FROM aminer_paper WHERE pid BETWEEN 1 AND 500 AND DATALENGTH(p_abstract) != 0";
SqlDataReader reader = query.ExecuteReader();
var summary = new List<string>();
while (reader.Read())
{
    summary.Add(reader["p_abstract"].ToString());
}
reader.Close();
string[] input_texts = summary.ToArray();
Now, I have to use this stopwords array to remove the stop words from the input_texts array. I have used the following technique, but it is not working; it behaves weirdly while accessing both arrays by index. For example, take the first text at index 0 of the input_texts array, i.e.
input_texts[0]
and match all the word strings in the stopwords array against it, i.e.
// have to match all indexes of stopwords[] with input_texts[0]
stopwords[]
Then, after removing the stopwords from the text at index 0 of the input_texts array, I have to repeat this for all the texts in the input_texts array.
Any suggestions and code samples with modifications would be highly appreciated, with acknowledgment.
Thanks.
You can use LINQ for this:
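// string[] input_text = new string[] { "ravi kumar", "ravi kumar", "ravi kumar" };
// string[] stopwords = new string[] { "ravi" };
// Count() on an array is the LINQ extension method, so this needs "using System.Linq;".
for (int i = 0; i < input_text.Count(); i++)
{
    for (int j = 0; j < stopwords.Count(); j++)
    {
        // Replace matches substrings, so a stop word is also removed from inside longer words.
        input_text[i] = input_text[i].Replace(stopwords[j], " ");
    }
}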
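Note that Replace works on substrings, so the stop word "ravi" would also be cut out of a longer word such as "ravine". If only whole words should be removed, one option is to split each text on whitespace and filter the words against a HashSet. Below is a minimal sketch, assuming words are separated by whitespace only and that matching should ignore case; the names StopWordFilter and RemoveStopWords are made up for illustration and are not from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StopWordFilter
{
    // Hypothetical helper: removes whole-word stop words from every text.
    public static string[] RemoveStopWords(string[] inputTexts, string[] stopwords)
    {
        // A HashSet gives fast lookups instead of rescanning the stopwords array
        // for each word; the comparer makes the match case-insensitive (assumption).
        var stopSet = new HashSet<string>(stopwords, StringComparer.OrdinalIgnoreCase);

        return inputTexts
            .Select(text => string.Join(" ",
                // Passing a null separator array splits on any whitespace.
                text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
                    .Where(word => !stopSet.Contains(word))))
            .ToArray();
    }
}
```

With the arrays from the question this would be called as string[] cleaned = StopWordFilter.RemoveStopWords(input_texts, stopwords);. Punctuation attached to a word (for example "the,") is not stripped by this sketch, so such tokens will not match a stop word unless the text is normalized first.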