Progress

| Deadline | Tasks | Completed? |
| --- | --- | --- |
| 1 Oct 2023 | Submit detailed project plan. Set up project website. | |
| 1 Oct 2023 | Create the chatbot UI with Gradio (Python). It accepts keyboard input and records voice. The chatbot answers by repeating keyboard input, or by replying “recording” for audio input. | |
| 8 Oct 2023 | Integrate the Whisper model to perform automatic speech recognition. The chat UI should print the transcribed text directly. | |
| 15 Oct 2023 | Create Python scripts that convert the PDFs stored in a folder named “data” to embeddings and store the embeddings in a folder named “embeddings”. | |
| 22 Oct 2023 | Integrate the vector database with the chatbot. The chatbot connects to the database at launch, embeds the user input as a query, retrieves results from the database, and prints them on the UI. | |
| 15 Nov 2023 | Integrate an LLM to paraphrase and format the text output. | |
| 29 Nov 2023 | Integrate the Google text-to-speech API to generate voice audio for playback. | |
| 8 Jan 2024 | Create test cases for: 1. Inference time and word error rate (WER) (speech-to-text). 2. Inference time (embedding). 3. Fine-tuning the confidence threshold (embedding). 4. A testing dataset of expected LLM output (LLM). 5. Chatbot output, focusing on non-toxic output and quality of audio generation (LLM and Google Text-to-Speech). | |
| 8-12 Jan 2024 | First presentation. | |
| 21 Jan 2024 | Submit interim report and preliminary implementation. | |
| Jan – Apr 2024 | Migrate the whole Python web UI to a Windows application with React Native. | |
| 9 Apr 2024 | Integrate a virtual avatar with an animation for each application state: idling, reading the response, and failing to retrieve relevant information. | |
| 15 Apr 2024 | Create UAT for overall user experience: 1. Create 3 intended use cases. 2. Test behaviour against toxic input. 3. Null input (empty string or silent audio). 4. Extremely long input or long waiting time (the user provides no input for a long time). 5. Correct animation played. 6. Audio quality. | |
| 23 Apr 2024 | Submit finalized, tested implementation and final report. | |
| 26 Apr 2024 | Project exhibition. | |
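
The word error rate (WER) named in the 8 Jan 2024 speech-to-text test cases could be measured along the following lines. This is a minimal sketch, not the project's actual test code: the function name `word_error_rate` is hypothetical, and lowercasing plus whitespace splitting is an assumed normalization. WER is computed as the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; normalization (lowercase,
    split on whitespace) is an assumption, not the project's choice.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

An empty hypothesis, such as a transcript of silent audio, scores 1.0 under this definition because every reference word counts as a deletion.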