Chinese doc search

This is a full-text search engine for Chinese articles, built with FastHTML and Elasticsearch.

Loading data into Elasticsearch from an Excel file

If you have enough memory to load the entire Excel file at once, it is straightforward to iterate over the rows:

   for i, row in df.iterrows():
       ...  # index each row; see create_fake_data.py for reference

Otherwise, do one of the following:

- Save the file as CSV and refer to "How do I read a large csv file with pandas?"
- Find out the number of rows in your file in advance and refer to "read a full excel file chunk by chunk using pandas"
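
The CSV route can be sketched roughly as follows. This is a minimal sketch, not the code used in this repo: the helper names iter_chunks and chunk_to_actions, the column names title and full_text, and the index name articles are all assumptions made for illustration. It relies on the chunksize parameter of pandas.read_csv, which returns an iterator of DataFrames instead of loading everything at once.

```python
import io

import pandas as pd


def iter_chunks(csv_source, chunk_size=1000):
    """Yield DataFrames of at most chunk_size rows (hypothetical helper)."""
    # With chunksize set, read_csv returns an iterator rather than one big DataFrame.
    for chunk in pd.read_csv(csv_source, chunksize=chunk_size):
        yield chunk


def chunk_to_actions(chunk, index_name):
    """Turn one chunk into dicts in the shape elasticsearch.helpers.bulk accepts."""
    records = chunk.to_dict(orient="records")
    return [{"_index": index_name, "_source": record} for record in records]


# A tiny in-memory CSV stands in for the exported file (assumed columns).
sample = io.StringIO("title,full_text\na,first\nb,second\nc,third\n")

sizes = []
actions = []
for chunk in iter_chunks(sample, chunk_size=2):
    sizes.append(len(chunk))
    actions.extend(chunk_to_actions(chunk, "articles"))

print(sizes)
print(actions[0])
```

In a real run, each chunk's actions would be passed to elasticsearch.helpers.bulk together with a client instance instead of being collected in a list, so that only one chunk is held in memory at a time.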

Set up Elasticsearch in Docker and start the container. My version is 8.15.0.
Create a .env file and define these variables in it:
- DEBUG
- ELASTICSEARCH_PORT
- ELASTICSEARCH_INDEX
Install the dependencies: pip install -r requirements.txt
Start the app: python src/main.py
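
How src/main.py reads these variables is not shown here, so the following is only a sketch of one way to parse them. The function name load_settings, the fallback values, and the accepted spellings of a true DEBUG value are assumptions; 9200 is Elasticsearch's default HTTP port, and articles is a hypothetical index name. Note that a .env file is not loaded into the environment automatically; a loader such as python-dotenv's load_dotenv() is one common way to do that.

```python
import os


def load_settings(env=os.environ):
    """Parse the three variables from a mapping (hypothetical helper)."""
    # Treat a few common spellings as true; anything else is false (assumption).
    debug = env.get("DEBUG", "false").strip().lower() in ("1", "true", "yes")
    # Environment values are strings, so the port is converted explicitly.
    port = int(env.get("ELASTICSEARCH_PORT", "9200"))
    index = env.get("ELASTICSEARCH_INDEX", "articles")
    return {"debug": debug, "port": port, "index": index}


# A plain dict stands in for the process environment in this example.
settings = load_settings(
    {
        "DEBUG": "true",
        "ELASTICSEARCH_PORT": "9200",
        "ELASTICSEARCH_INDEX": "articles",
    }
)
print(settings)
```

Passing the mapping in as an argument keeps the parsing easy to check without touching the real environment.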

The tests were run on 10,000 entries; each entry's full_text field is 10,000 characters long.

Please click on the links to watch the preview videos
