nikobad

this is a set of python codes for experiments on different sentiment classification algorithms

running different configuration of lstm and dlstm

prerequisits

not all the prerquisits are needed for all experiments

tensorflow
tflearn
NLTK
sklearn

creating PKL file

use utilities/aclimdb_tflearnstyle_preprocess.py to create the pkl file. change the dataset name and path. the result would be saved in 4-PKLED/ under the name of the datasetName.pkl

use runner to run lstm and dlstm variation of datsets.

this file can run different configuration on different variations of datasets
open the file and use the section marked by configuration variations to set different configurations. possible configurations are commented in the file. you can use the http://tflearn.org/ to check the possible variations as well.
runner would save each of the model variation under the folder 5-SAVED_MODELS and will not overwrite the model. if you need to redo the training on one model remove the coresponding files from the folder
each run's result would be saved under ./tflearn_logs/ with a special directory name format:

    <classifier>'-DS'<dataset_name>'-EMBEDDING'<number_of_words_used_in_embedding>'-DROPOUT'<dropout>'-N_EPOCH'<n_epoch>'-LOSS'<loss>'-OPTIMIZER'<optimizer>

bidirectional RNN

to run different variations of use bidirectional_lstm.py file. due to some problems with tflearn running consecutive Bidirectional RNN was not possible

logistic regression on doc2vec representation of words

to run a logistic regression on the results of doc2vec processing of a dataset you should first change the format of the dataset to a doc2vec compatible format using utilities/convertFromDirToSingleFile.py. this would add .txt files in the same address
use utilities/makeDoc2VecPkl.py to create a doc2vec pkl file. this would take some time and the result would be saved in 4-PKLED as <datasetName>.d2v. you can also change the properties of doc2vec learner to get better results for word vectors.
use doc2vec-logistic-regression.py for executing logistic regression. the results would be logged in tensorboard readable style under 'doc2vecds-'<dataset>'optimizer-'<optimizer>'loss-'<loss_function>

classification using fasttext

use convertFromDirToFastTextFormat.py to convert datset dirs to a fasttext compatible format
use makeFastTextModel.py to make fasttext model. number of epochs and window size and word ngrams are customizable. this would save the fasttext model in the curret directory
use utilities/makeFastTextpklFile.py to create pkl file suited for fast text. this would save a pkl file under 4-PKLED folder.
use classification-example.sh file from fasttext for classification

classification using naive bayes

use mobayes/naiveBayes.py for executing the bayesclassifier on a dataset with the regular dir format

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
mobayes		mobayes
paper		paper
tflearn_logs		tflearn_logs
utilities		utilities
.gitignore		.gitignore
LSTMconfig.py		LSTMconfig.py
README.MD		README.MD
bidirectional_lstm.py		bidirectional_lstm.py
doc2vec-logistic-LSTM.py		doc2vec-logistic-LSTM.py
doc2vec-logistic-regression-unbalanced.py		doc2vec-logistic-regression-unbalanced.py
doc2vec-logistic-regression.py		doc2vec-logistic-regression.py
dynamic_lstm.py		dynamic_lstm.py
fasttext-test.py		fasttext-test.py
lstm.py		lstm.py
newlstm.py		newlstm.py
online_doc2vec.py		online_doc2vec.py
runner.py		runner.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

nikobad

running different configuration of lstm and dlstm

prerequisits

creating PKL file

use runner to run lstm and dlstm variation of datsets.

bidirectional RNN

logistic regression on doc2vec representation of words

classification using fasttext

classification using naive bayes

About

Releases

Packages

Languages

mbaniasad/2nikobad

Folders and files

Latest commit

History

Repository files navigation

nikobad

running different configuration of lstm and dlstm

prerequisits

creating PKL file

use runner to run lstm and dlstm variation of datsets.

bidirectional RNN

logistic regression on doc2vec representation of words

classification using fasttext

classification using naive bayes

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages