* add activate directory to .gitignore
* add my custom env to gitignore, will have to change that
* add unstructured to kotaemon pyproject.toml
* add .env to gitignore
* remove .env from tracking
* make changes to the run_macos script, update readme with more detailed instructions
* remove my personal changes from gitignore
* remove line from run_macos script
* remove option for not installing miniconda for non-technical users, mark docker dependency as optional
* docs: update demo URL
* gitignore changes
* merge .env.example
* revert changes to run_macos.sh
* move unstructured to advanced dependencies
* add link to unstructured system dependencies
* remove api key
* fix: skip tests when unstructured pdf not installed
* chore: loosen unstructured package version in pyproject.toml
* chore: correct syntax
---------
Co-authored-by: Tadashi <tadashi@cinnamon.is>
Co-authored-by: cin-albert <albert@cinnamon.is>
* Rename AzureChatOpenAI to LCAzureChatOpenAI
* Provide vanilla ChatOpenAI and AzureChatOpenAI
* Remove the highest accuracy, lowest cost criteria
These criteria are unnecessary. The users, not the pipeline creators, should choose
which LLM to use. Furthermore, it's cumbersome to input this information, and it
really degrades the user experience.
* Remove the LLM selection in simple reasoning pipeline
* Provide a dedicated stream method to generate the output
* Return placeholder message to chat if the text is empty
* Auto create conversation when the user starts
* Add conversation rename rule check
* Fix empty name during save
* Confirm deleting conversation
* Show a warning if users don't select a file when uploading files in the File Index
* Show feedback when the user uploads a duplicated file
* Limit the file types
* Fix username validation
* Allow login when the username has leading or trailing whitespace
* Improve the user experience
* Disable admin panel for non-admin users
* Refresh user lists after creating/deleting users
* Auto login
* Clear admin information upon signing out
* Fix being unable to receive uploaded filenames that include special characters, like !@#$%^&*().pdf
* Set upload validation for FileIndex
* Improve user management UI/UX
* Show extraction error when indexing file
* Return selected user as -1 when signing out
* Fix default supported file types in file index
* Validate changing password
* Allow the selector to contain multiple gradio components
* A more tolerable placeholder screen
* Allow chat suggestion box
* Increase concurrency limit
* Make adobe loader optional
* Use BaseReasoning
---------
Co-authored-by: trducng <trungduc1992@gmail.com>
* serve local model in a different process from the app
---------
Co-authored-by: albert <albert@cinnamon.is>
Co-authored-by: trducng <trungduc1992@gmail.com>
* feat: Add installers for linux, windows, and macos
* docs: Update README
* pre-commit fix styles
* Update installers and README
* Remove env vars check and fix paths
* Update installers:
* Remove start.py and move install and launch part back to .sh/.bat
* Add conda deactivate
* Make messages more informative
* Improve kotaemon based on insights from projects (#147)
- Include static files in the package.
- More reliable information panel. Faster & not breaking randomly.
- Add directory upload.
- Enable zip file to upload.
- Allow setting endpoint for the OCR reader using environment variable.
* Make macOS installer runnable and improve Windows, Linux installers
* Minor fix macos commands
* installation should pause before exit
* Update Windows installer: add a new label to exit function with error
* put install_dir to .gitignore
* chore: Add comments to clarify the 'end' labels
---------
Co-authored-by: Duc Nguyen (john) <trungduc1992@gmail.com>
Co-authored-by: ian <ian@cinnamon.is>
* add test case for Chroma save/load
* minor name change
* add delete_collection support for chroma
* move save load to chroma
---------
Co-authored-by: Nguyen Trung Duc (john) <john@cinnamon.is>
- Use cases related to LLM call: https://cinnamon-ai.atlassian.net/browse/AUR-388?focusedCommentId=34873
- Sample usages: `test_llms_chat_models.py` and `test_llms_completion_models.py`:
```python
from kotaemon.llms.chats.openai import AzureChatOpenAI
model = AzureChatOpenAI(
    openai_api_base="https://test.openai.azure.com/",
    openai_api_key="some-key",
    openai_api_version="2023-03-15-preview",
    deployment_name="gpt35turbo",
    temperature=0,
    request_timeout=60,
)
output = model("hello world")
```
For the LLM-call component, I decided to wrap around Langchain's LLM models and Langchain's Chat models, and set the interface as follows:
- Completion LLM component:
```python
class CompletionLLM:
    def run_raw(self, text: str) -> LLMInterface:
        # Run text completion: str in -> LLMInterface out
        ...

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        # Run text completion in batch: list[str] in -> list[LLMInterface] out
        ...

    # run_document and run_batch_document just reuse run_raw and
    # run_batch_raw, due to unclear use case
```
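To make the contract concrete, here is a toy sketch of a conforming component (`EchoLLM` is invented purely for illustration; it echoes the input uppercased through the `LLMInterface` dataclass shown further below):

```python
# Toy stand-in for a real completion component, only to make the sketch
# self-contained; a real component would call an actual LLM provider.
class EchoLLM(CompletionLLM):
    def run_raw(self, text: str) -> LLMInterface:
        return LLMInterface(text=[text.upper()])

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        return [self.run_raw(t) for t in text]

llm = EchoLLM()
print(llm.run_raw("hello").text[0])                        # "HELLO"
print([o.text[0] for o in llm.run_batch_raw(["a", "b"])])  # ["A", "B"]
```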
- Chat LLM component:
```python
class ChatLLM:
    def run_raw(self, text: str) -> LLMInterface:
        # Run chat completion (no chat history): str in -> LLMInterface out
        ...

    def run_batch_raw(self, text: list[str]) -> list[LLMInterface]:
        # Run chat completion in batch mode (no chat history):
        # list[str] in -> list[LLMInterface] out
        ...

    def run_document(self, text: list[BaseMessage]) -> LLMInterface:
        # Run chat completion (with chat history):
        # list[langchain's BaseMessage] in -> LLMInterface out
        ...

    def run_batch_document(self, text: list[list[BaseMessage]]) -> list[LLMInterface]:
        # Run chat completion in batch mode (with chat history):
        # list[list[langchain's BaseMessage]] in -> list[LLMInterface] out
        ...
```
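A usage sketch for the history-aware path, reusing the `model` instance from the AzureChatOpenAI example above (the message classes are Langchain's; the credentials remain placeholders):

```python
from langchain.schema import AIMessage, HumanMessage, SystemMessage

# `model` is the AzureChatOpenAI instance constructed in the first example.
history = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is 2 + 2?"),
    AIMessage(content="4"),
    HumanMessage(content="Now double it."),
]
output = model.run_document(history)  # chat completion with chat history
print(output.text[0], output.total_tokens)
```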
- The LLMInterface is as follows:
```python
from dataclasses import dataclass, field


@dataclass
class LLMInterface:
    text: list[str]
    completion_tokens: int = -1
    total_tokens: int = -1
    prompt_tokens: int = -1
    logits: list[list[float]] = field(default_factory=list)
```
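A quick sketch of constructing and consuming the dataclass (the values here are made up for illustration):

```python
result = LLMInterface(
    text=["Hello!"],
    completion_tokens=2,
    total_tokens=12,
    prompt_tokens=10,
)
answer = result.text[0]       # primary generated text
usage = result.total_tokens   # token accounting; -1 means unknown
```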