This is a Chinese article full-text search engine built with FastHTML and Elasticsearch.
If you have enough memory to load the entire Excel file at once, it is straightforward to iterate over the rows:

```python
for i, row in df.iterrows():
    ...  # see create_fake_data.py for reference
```
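As a minimal sketch of that full-load approach: iterate over the DataFrame with `iterrows()` and turn each row into an Elasticsearch bulk action. The index name `articles` and the column names here are illustrative, not taken from the repo; with a real cluster you would pass `actions` to `elasticsearch.helpers.bulk()`.

```python
# Assumes the whole file fits in memory as one DataFrame.
import pandas as pd

df = pd.DataFrame(
    {
        "title": ["文章一", "文章二"],
        "full_text": ["这是第一篇文章的正文。", "这是第二篇文章的正文。"],
    }
)

# Build one bulk action per row; feed these to
# elasticsearch.helpers.bulk(es_client, actions) against a live cluster.
actions = []
for i, row in df.iterrows():
    actions.append(
        {
            "_index": "articles",  # hypothetical index name
            "_id": i,
            "_source": {"title": row["title"], "full_text": row["full_text"]},
        }
    )
```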
Otherwise, please either:
- save the file as CSV and refer to "How do I read a large csv file with pandas?", or
- determine the number of rows in your file in advance and refer to "read a full excel file chunk by chunk using pandas".
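The CSV alternative above can be sketched with pandas' `chunksize` parameter: `read_csv` then yields DataFrames of at most `chunksize` rows each, so the full file never has to fit in memory. The in-memory CSV and the chunk size of 4 are illustrative.

```python
# Chunked reading: each `chunk` is a regular DataFrame, so it can be
# indexed exactly like the load-everything-at-once case.
import io
import pandas as pd

csv_data = io.StringIO(
    "id,full_text\n" + "\n".join(f"{i},text {i}" for i in range(10))
)

total_rows = 0
for chunk in pd.read_csv(csv_data, chunksize=4):  # 4-row chunks: 4 + 4 + 2
    total_rows += len(chunk)
```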
- Set up Elasticsearch in Docker and start the container (my version is 8.15.0).
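One way to start a single-node Elasticsearch 8.15.0 container for local development is the command below; the container name and security settings are my assumptions, not from the repo (security is disabled here for convenience, which is not suitable for production).

```shell
docker run -d --name es-search \
  -p 9200:9200 \
  -e discovery.type=single-node \
  -e xpack.security.enabled=false \
  docker.elastic.co/elasticsearch/elasticsearch:8.15.0
```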
- Create a `.env` file and define these variables in it:
  - `DEBUG`
  - `ELASTICSEARCH_PORT`
  - `ELASTICSEARCH_INDEX`
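An example `.env` with those three variables might look like the following; the values shown are illustrative placeholders, not the project's defaults.

```
DEBUG=True
ELASTICSEARCH_PORT=9200
ELASTICSEARCH_INDEX=articles
```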
```shell
pip install -r requirements.txt
python src/main.py
```
The tests were run on 10,000 entries; each entry's `full_text` field is 10,000 characters long.
Please click on the links to watch the preview videos