Progress – fyp23023site

Deadline	Tasks	Completed?
1 Oct 2023	Submit detailed project plan. Set up project website.
1 Oct 2023	Create the chatbot UI with Gradio (Python). It accepts typed input and voice recordings. At this stage the chatbot replies by echoing typed input, or with “recording” for audio input.
8 Oct 2023	Integrate the Whisper model to perform automatic speech recognition. The chat UI should display the transcribed text directly.
15 Oct 2023	Create Python scripts to convert the PDF files stored in a folder named “data” into embeddings. Store the embeddings in a folder named “embeddings”.
22 Oct 2023	Integrate the vector database into the chatbot. The chatbot connects to the database at launch, embeds the user input as a query, retrieves results from the database, and displays them in the UI.
15 Nov 2023	Integrate an LLM to paraphrase and format the text output.
29 Nov 2023	Integrate the Google Text-to-Speech API to generate voice audio for playback.
8 Jan 2024	Create test cases covering: 1. Inference time and word error rate (WER) (speech-to-text). 2. Inference time (embedding). 3. Fine-tuning the confidence threshold (embedding). 4. A testing dataset of expected LLM output (LLM). 5. Chatbot output, focusing on non-toxic output and the quality of audio generation (LLM and Google Text-to-Speech).
8-12 Jan 2024	First presentation
21 Jan 2024	Submit interim report and preliminary implementation.
Jan – Apr 2024	Migrate the whole Python web UI to a Windows application built with React Native.
9 Apr 2024	Integrate a virtual avatar whose animation changes with the application state. The states are: idling, reading a response, and failing to retrieve relevant information.
15 Apr 2024	Create user acceptance tests (UAT) for the overall user experience: 1. Create 3 intended use cases. 2. Test behaviour against toxic input. 3. Null input (empty string or silent audio). 4. Extremely long input or a long waiting time (the user provides no input for a long time). 5. Correct animation is played. 6. Audio quality.
23 Apr 2024	Submit the finalized, tested implementation and the final report.
26 Apr 2024	Project exhibition.
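
The speech-to-text test cases planned for 8 Jan 2024 rely on word error rate (WER). As a rough illustration only, WER can be computed as the word-level edit distance between a reference transcript and the recognised text, divided by the number of reference words. The sketch below is not the project's test code: the function name `word_error_rate` is hypothetical, and lowercasing plus whitespace splitting is an assumed normalisation (punctuation is left untouched).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalisation: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
print(word_error_rate("turn on the light", "turn the light"))  # 0.25
```

Because insertions count as errors, WER can exceed 1.0 when the recognised text is much longer than the reference, so a test threshold should not assume a 0 to 1 range.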