Commit Graph

306 Commits

Author SHA1 Message Date
ian_Cin
a8f92b3f9e post migrate cleanup 2024-03-18 23:10:20 +07:00
ian_Cin
df12dec732 Feat/local endpoint llm (#148)
* serve local model in a different process from the app
---------

Co-authored-by: albert <albert@cinnamon.is>
Co-authored-by: trducng <trungduc1992@gmail.com>
2024-03-15 16:17:33 +07:00
Duc Nguyen (john)
2950e6ed02 Improve behavior of simple reasoning (#157)
* Add base reasoning implementation

* Provide explicit async and streaming capability

* Allow refreshing the information panel
2024-03-12 13:03:38 +07:00
Duc Nguyen (john)
cb01d27d19 Fix integrating indexing and retrieval pipelines to FileIndex (#155)
* Add docs for settings
* Add mdx_truly_sane_lists to doc requirements
2024-03-10 16:41:42 +07:00
trducng
2b3571e892 Fix subscribing sign-in/out 2024-03-08 13:38:29 +07:00
Duc Nguyen (john)
4f356f7f9a Provide dedicated page for login (#153) 2024-03-08 08:06:51 +07:00
Duc Nguyen (john)
9725d60791 Create user management functionality (#152)
* Create user management page
* Remove old user creating UI
* Add username validation; admin user auto-creation
* Provide docs on user management
* Bump version
2024-03-07 14:19:37 +07:00
Duc Nguyen (john)
8a90fcfc99 Restructure index to allow it to be dynamically created by end-user (#151)
1. Introduce the concept of "collection_name" to the docstore and vector store. Each collection can be viewed similarly to a table in a SQL database, which allows information within a data source to be organized better.
2. Move the `Index` and `Source` tables from the application scope into the index scope. A new set of these tables is created for each index the user creates, so they belong to the index rather than the app.
3. Make each index responsible for the UI components in the app.
4. Construct the File UI page.
2024-03-07 01:50:47 +07:00
Albert (Quang)
cc87aaa783 Add one-click installers for Linux, Windows, and MacOS (#146)
* feat: Add installers for linux, windows, and macos

* docs: Update README

* pre-commit fix styles

* Update installers and README

* Remove env vars check and fix paths

* Update installers:
  * Remove start.py and move install and launch part back to .sh/.bat
  * Add conda deactivate
  * Make messages more informative

* Improve kotaemon based on insights from projects (#147)

- Include static files in the package.
- More reliable information panel. Faster & not breaking randomly.
- Add directory upload.
- Enable zip file upload.
- Allow setting endpoint for the OCR reader using environment variable.

* Make macOS installer runnable and improve Windows, Linux installers

* Minor fix macos commands

* installation should pause before exit

* Update Windows installer: add a new label to exit function with error

* put install_dir to .gitignore

* chore: Add comments to clarify the 'end' labels

---------

Co-authored-by: Duc Nguyen (john) <trungduc1992@gmail.com>
Co-authored-by: ian <ian@cinnamon.is>
2024-03-06 10:59:30 +07:00
Duc Nguyen (john)
033e7e05cc Improve kotaemon based on insights from projects (#147)
- Include static files in the package.
- More reliable information panel. Faster & not breaking randomly.
- Add directory upload.
- Enable zip file upload.
- Allow setting endpoint for the OCR reader using environment variable.
2024-02-28 22:18:29 +07:00
Duc Nguyen (john)
e1cf970a3d Disable Gradio analytics and unnecessary font loading to avoid app hanging in private network (#145) 2024-02-20 22:02:28 +07:00
trducng
08cc99d8db Add docstring for database and OCR loader 2024-02-20 21:20:48 +07:00
Duc Nguyen (john)
767aaaa1ef Utilize llama.cpp for both completion and chat models (#141) 2024-02-20 18:17:48 +07:00
ian_Cin
a86c727869 add albert to git-secret (#140)
* add albert to git-secret

* update readme

* Limit llama-index version

* Langchain upgrade their wikipedia tool name

---------

Co-authored-by: trducng <trungduc1992@gmail.com>
2024-02-20 17:28:06 +07:00
trducng
89450ab661 Enable zip file upload in ktem 2024-02-20 02:59:46 +07:00
Duc Nguyen (john)
d36522129f refactor: replace the llama-index-based loader with a llama-index mixin loader (#142) 2024-02-20 02:33:28 +07:00
trducng
7fc54d52e4 Improve ocr loader error message 2024-02-06 12:21:12 +07:00
trducng
1a4fd7c33f Update default settings to conform to Langchain's Azure implementation 2024-02-05 18:04:36 +07:00
trducng
771f074c0e Add utf-8 encoding in Help Page for rendering on Windows 2024-02-05 16:42:40 +07:00
trducng
bff55230ba Reduce the default chunk size in the reasoning pipeline to fit LLM capability 2024-02-03 09:38:50 +07:00
trducng
107bc7580e Enable HTML upload 2024-02-02 11:37:57 +07:00
Duc Nguyen (john)
65852b7d71 Add docx + html reader (#139) 2024-01-31 19:21:30 +07:00
ian_Cin
116919b346 Update docs (#106) 2024-01-30 18:50:17 +07:00
trducng
cbe40fac99 Show retrieved but non-evidence docs. Support language changing 2024-01-29 11:16:07 +07:00
trducng
50b5d936f5 Optionally allow database migration with Alembic 2024-01-28 19:54:15 +07:00
trducng
04635b77f6 Make the database table customizable 2024-01-28 07:54:38 +07:00
trducng
6ae9634399 Enable .doc file 2024-01-27 23:45:19 +07:00
trducng
23c0331bab Enable pptx support 2024-01-27 23:08:06 +07:00
trducng
80ec214107 Fix loaders' file_path and other metadata 2024-01-27 22:52:46 +07:00
trducng
c6637ca56e Relate the retrievers to the indexer 2024-01-27 16:39:40 +07:00
trducng
9b586466ff Add the tutorial to mkdocs 2024-01-26 15:40:04 +00:00
Duc Nguyen (john)
22c646e5c4 Add documentation about adding reasoning and indexing pipelines to the application (#138) 2024-01-26 22:31:52 +07:00
trducng
757aabca4d Add app title, favicon. More natural chat 2024-01-25 22:40:32 +07:00
Duc Nguyen (john)
513e86f490 Add dedicated information panel to the UI (#137)
* Allow streaming to the chatbot and the information panel without threading
* Highlight evidence in a simple manner
2024-01-25 19:07:53 +07:00
Duc Nguyen (john)
ebc61400d8 Provide a developer mode when running ktem (#135)
Implement and utilize `on_app_created` to support the developer mode.
2024-01-23 11:46:59 +07:00
Duc Nguyen (john)
2dd531114f Make ktem official (#134)
* Move kotaemon and ktem into same folder

* Update docs

* Update CI

* Resolve mypy, isorts

* Re-allow test pdf files
2024-01-23 10:54:18 +07:00
Duc Nguyen (john)
9c5b707010 Customize application settings (#132)
* Allow customizing the base application

* Make the core llms and embeddings customizable

* Make the settings, reasoning and index customizable

* Import from langchain_openai
2024-01-21 14:36:07 +07:00
Duc Nguyen (john)
5a9d6f75be Migrate the MVP into kotaemon (#108)
- Migrate the MVP into kotaemon.
- Preliminarily include the pipeline within the chatbot interface.
- Organize MVP as an application.

Todo:

- Add an info panel to view the planning of agents -> Fix streaming agents' output.

Resolve: #60
Resolve: #61 
Resolve: #62
2024-01-10 15:28:09 +07:00
ian_Cin
230328c62f Best docs Cinnamon will probably ever have (#105) 2023-12-20 11:30:25 +07:00
Duc Nguyen (john)
0e30dcbb06 Create Langchain LLM converter to quickly supply it to Langchain's chain (#102)
* Create Langchain LLM converter to quickly supply it to Langchain's chain

* Clean up
2023-12-11 14:55:56 +07:00
Duc Nguyen (john)
da0ac1d69f Change template to private attribute and simplify imports (#101)
---------

Co-authored-by: ian <ian@cinnamon.is>
2023-12-08 18:10:34 +07:00
Duc Nguyen (john)
1f927d3391 Upgrade promptui to conform to Gradio V4 (#98) 2023-12-07 15:24:07 +07:00
ian_Cin
797df5a69c refactor agents (#100)
* refactor agents

* minor cosmetic, add terminal ui for cli

* bump to 0.3.4

* Add temporary path

* fix unclosed files in tests

---------

Co-authored-by: trducng <trungduc1992@gmail.com>
2023-12-06 17:06:29 +07:00
Tuan Anh Nguyen Dang (Tadashi_Cin)
d9e925eb75 Add UnstructuredReader with support for various legacy files (.doc, .xls) (#99) 2023-12-05 16:19:13 +07:00
Duc Nguyen (john)
37c744b616 Add file-based document store and vector store (#96)
* Modify docstore and vectorstore objects to be reconstructable
* Simplify the file docstore
* Use the simple file docstore and vector store in MVP
2023-12-04 17:46:00 +07:00
Duc Nguyen (john)
0ce3a8832f Provide type hints for pass-through Langchain and Llama-index objects (#95) 2023-12-04 10:59:13 +07:00
Duc Nguyen (john)
e34b1e4c6d Refactor the index component and update the MVP insurance accordingly (#90)
Refactor the `kotaemon/pipelines` module to `kotaemon/indices`. Create the VectorIndex.

Note: `qa` currently lives inside `kotaemon/indices` because, at the moment, `qa` is only used in RAG. It could also be an independent module at `kotaemon/qa`. Since this can be changed later, I am going with the first option for now and will revisit it if needed.
2023-11-30 18:35:07 +07:00
Nguyen Trung Duc (john)
8e3a1d193f Refactor agents and tools (#91)
* Move tools to agents

* Move agents to a dedicated place

* Remove subclassing BaseAgent from BaseTool
2023-11-30 09:52:08 +07:00
ian_Cin
4256030b4f Adopt pyproject.toml (#89)
* ditching setup.py in favour of pyproject.toml; bump to 0.3.2

* bump to 0.3.3
2023-11-29 14:58:35 +07:00
ian_Cin
8e0779a22d Enforce all IO objects to be subclassed from Document (#88)
* enforce Document as IO

* Separate rerankers, splitters and extractors (#85)

* partially refactor importing

* add text to embedding outputs

---------

Co-authored-by: Nguyen Trung Duc (john) <trungduc1992@gmail.com>
2023-11-27 16:35:09 +07:00