Updated Documentation and Readme

master
Ilya P 9 years ago
parent 59a5e1586a
commit 95030af933

@ -3,12 +3,14 @@
Ambar Web API documentation
- [Files](#files)
- [Get Meta by Meta Id](#get-meta-by-meta-id)
- [Get File Source by Meta Id](#get-file-source-by-meta-id)
- [Get Parsed Text From File by Meta Id](#get-parsed-text-from-file-by-meta-id)
- [Get File Content by Secure Uri](#get-file-content-by-secure-uri)
- [Get Parsed Text by Secure Uri](#get-parsed-text-by-secure-uri)
- [Get File Meta by File Id](#get-file-meta-by-file-id)
- [Get File Source by File Id](#get-file-source-by-file-id)
- [Get Parsed Text From File by File Id](#get-parsed-text-from-file-by-file-id)
- [Download File Content by Secure Uri](#download-file-content-by-secure-uri)
- [Download Parsed Text by Secure Uri](#download-parsed-text-by-secure-uri)
- [Upload File](#upload-file)
- [Hide File](#hide-file)
- [Unhide File](#unhide-file)
- [Search](#search)
- [Search For Documents By Query](#search-for-documents-by-query)
@ -20,20 +22,35 @@ Ambar Web API documentation
- [Statistics](#statistics)
- [Get Statistics](#get-statistics)
- [Tags](#tags)
- [Delete Tag From File](#delete-tag-from-file)
- [Get Tags](#get-tags)
- [Add Tag For File](#add-tag-for-file)
- [Thumbnails](#thumbnails)
- [Get Thumbnail by Id](#get-thumbnail-by-id)
- [Add or Update Thumbnail](#add-or-update-thumbnail)
- [Users](#users)
- [Login](#login)
- [Logout](#logout)
# Files
## Get Meta by Meta Id
## Get File Meta by File Id
GET api/files/direct/:metaId/meta
GET api/files/direct/:fileId/meta
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Success Response
@ -49,12 +66,18 @@ HTTP/1.1 404 Not Found
```
File meta or content not found
```
## Get File Source by Meta Id
## Get File Source by File Id
GET api/files/direct/:metaId/source
GET api/files/direct/:fileId/source
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Success Response
@ -70,12 +93,18 @@ HTTP/1.1 404 Not Found
```
File meta or content not found
```
## Get Parsed Text From File by Meta Id
## Get Parsed Text From File by File Id
GET api/files/direct/:fileId/text
GET api/files/direct/:metaId/text
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Success Response
@ -91,7 +120,7 @@ HTTP/1.1 404 Not Found
```
File meta or content not found
```
## Get File Content by Secure Uri
## Download File Content by Secure Uri
@ -112,7 +141,7 @@ HTTP/1.1 404 Not Found
```
File meta or content not found
```
## Get Parsed Text by Secure Uri
## Download Parsed Text by Secure Uri
@ -139,6 +168,12 @@ File meta or content not found
POST api/files/:sourceId/:filename
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Examples
@ -156,8 +191,7 @@ http://ambar_api_address/api/files/Default/test.txt \
HTTP/1.1 200 OK
```
HTTP/1.1 200 OK
Json format: { metaId: xxxxx }
{ "fileId": xxxxx }
```
### Error Response
@ -171,6 +205,60 @@ HTTP/1.1 404 Not Found
```
File meta or content not found
```
## Hide File
<p>Hide file by file id</p>
PUT api/files/hide/:fileId
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Success Response
HTTP/1.1 200 OK
```
HTTP/1.1 200 OK
```
### Error Response
HTTP/1.1 404 NotFound
```
File not found
```
## Unhide File
<p>Unhide file by file id</p>
PUT api/files/unhide/:fileId
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Success Response
HTTP/1.1 200 OK
```
HTTP/1.1 200 OK
```
### Error Response
HTTP/1.1 404 NotFound
```
File not found
```
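As a rough illustration of the two endpoints above, the sketch below builds (but does not send) the `PUT` requests using only the Python standard library. The host `ambar_api_address` is the placeholder from the existing examples, and the helper names `auth_headers` and `visibility_request` are hypothetical, not part of Ambar.

```python
from urllib.parse import quote
from urllib.request import Request

# Placeholder host, taken from the curl examples in this documentation.
API_BASE = "http://ambar_api_address"


def auth_headers(email, token):
    """Headers listed in the 'Headers' tables: user email and user token."""
    return {"ambar-email": email, "ambar-email-token": token}


def visibility_request(file_id, email, token, hide=True):
    """Build a PUT request for api/files/hide/:fileId or api/files/unhide/:fileId.

    Hypothetical helper: it only constructs the request object.
    """
    action = "hide" if hide else "unhide"
    url = f"{API_BASE}/api/files/{action}/{quote(file_id, safe='')}"
    return Request(url, method="PUT", headers=auth_headers(email, token))


if __name__ == "__main__":
    req = visibility_request("some-file-id", "user@example.com", "some-token")
    # urllib normalises header capitalisation; HTTP header names are case-insensitive.
    print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would actually send it; per the sections above, a 404 then means the file was not found.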
# Search
## Search For Documents By Query
@ -179,6 +267,12 @@ File meta or content not found
GET api/search
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email.</p> |
| ambar-email-token | String | <p>User token.</p> |
### Parameters
@ -201,46 +295,53 @@ curl -i http://ambar_api_address/api/search?query=John
HTTP/1.1 200 OK
```
{
"total":1,
"hits":[
{
"score": 0.5184899160927989,
"sha256": "318be2290125e0a6cfb7229133ba3c4632068ae04942ed5c7c660718d9d41eb3",
"meta": [
{
"extension": ".txt",
"full_name": "//winserver/share/englishstories/aesop11.txt",
"updated_datetime": "2017-01-13 14:06:20.098",
"indexed_datetime": "2017-04-03 11:21:10.064",
"extra": [],
"short_name": "aesop11.txt",
"id": "b84b6d7cf88655e9916d5fbb67886b1befa0d00d99f58b62e72fb04b51ff3c31",
"source_id": "Books",
"created_datetime": "2017-01-13 14:06:20.026",
"download_uri": "b41c4aaa2999ce42957f087db8e7608970efcedb1eaa40c28336390ecb5373849c955f395258f3dfd7482d4b84d543cdcc27104c934cd4efdb0ba8c54e6e8e5f3367190091fa4db779fe2097565921e69be43e80068893bafa0dd0b98f90ec96469df54050dee2649b68646824da2cc32061c24a7ed93a6d1514a89a75360551267015c3035515bbb2971a1fcfdf456a"
}
"sha256":"60a777c59176e98efee98bf16b67983dc981ec4da3eaafcb4d79046d005456f9",
"meta":{
"id":"ac8965ab5e07582e0e57cde0e7c4c2d49b955f8b26c779903191893fcb942fa4",
"full_name":"//mail.nic.ru/hello@ambar.cloud/linus torvalds talk of tech innovation is bullshit shut up and get the work done fcc chairman wants it to be easier to listen to free fm radio on your smartphone.eml",
"short_name":"linus torvalds talk of tech innovation is bullshit shut up and get the work done fcc chairman wants it to be easier to listen to free fm radio on your smartphone.eml",
"extension":".eml",
"extra":[
],
"source_id":"AmbarEmail",
"created_datetime":"2017-02-17 09:22:44.000",
"updated_datetime":"2017-02-17 09:22:44.000",
"download_uri":"b41c4aaa2999ce42957f087db8e7608970efcedb1eaa40c28336390ecb5373849c955f395258f3dfd7482d4b84d543cdfc23cff8df311276a5e111c0504315c60b159cd2fe2cee20c5470789d9d15e4d7e5fb7c2bc60c29bf9a578e47541fb354dcb5109e49ea9019b2d68c3b35e521a418d9c94f0af55dc79c2442188f039c924d0190c72f488ad77647f2a52aaa267"
},
"indexed_datetime":"2017-05-31 13:36:40.400",
"file_id":"aa5e000fd79cfed0e839af7073e1ef135e128408f984b9a8e70e34242b49f01a",
"content":{
"size": 229353,
"indexed_datetime": "2017-04-03 11:21:10.163",
"author": "",
"processed_datetime": "2017-04-03 11:21:10.163",
"size":49282,
"author":"Slashdot Headlines <slashdot@newsletters.slashdot.org>",
"ocr_performed":false,
"processed_datetime":"2017-05-31 13:36:40.361",
"length":"",
"language":"",
"thumb_available":false,
"state":"processed",
"title":"",
"type": "text/plain; charset=windows-1252"
"type":"message/rfc822",
"highlight":{
"text":[
"taking no notice of the grain. <br/>The Mule which had been robbed and wounded bewailed his<br/>misfortunes. The other replied, \"I am indeed glad that I was<br/>thought so little of, for I have lost nothing, nor am I hurt with<br/>any wound.\" <br/>The Viper and the File <br/>A LION, entering the workshop of a <em>smith</em>, sought from the tools<br/>the means of satisfying his hunger. He more particularly<br/>addressed himself to a File, and asked of him the favor of a<br/>meal. The File replied, \"You must indeed be a simple-minded<br/>fellow if you expect to get anything from me, who am accustomed<br/>to take from everyone, and",
"Aesop, by some strange accident it seems to have entirely<br/>disappeared, and to have been lost sight of. His name is<br/>mentioned by Avienus; by Suidas, a celebrated critic, at the<br/>close of the eleventh century, who gives in his lexicon several<br/>isolated verses of his version of the fables; and by <em>John</em><br/>Tzetzes, a grammarian and poet of Constantinople, who lived<br/>during the latter half of the twelfth century. Nevelet, in the<br/>preface to the volume which we have described, points out that<br/>the Fables of Planudes could not be the work of Aesop, as they<br/>contain a reference in two places to \"Holy"
"__________________________________________________________________________<br/>Linus Torvalds: Talk of Tech Innovation is Bullshit. Shut Up and Get the Work Done<br/>http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,d9zf,fh0y,9dml,a0z3<br/><em>Elon Musk</em> Is <em>Really Boring</em><br/>http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,a5s3,9k63,9dml,a0z3<br/>FCC Chairman Wants It To Be Easier To Listen To Free FM Radio On Your Smartphone<br/>http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp",
"self-serving. From a report on The Register: The term of art he used was more blunt: \"The innovation the industry talks about so much is... Read More http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,aiki,d8f2,9dml,a0z3<br/><em>Elon Musk</em> Is <em>Really Boring</em> http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,ezm,35uk,9dml,a0z3<br/>From the boring-company department<br/>Sometimes it is hard to tell if Elon Musk is serious about the things he says. But as for his \"boring\" claims, that's",
"email to: unsubscribe-47676@elabs10.com<br/>Slashdot | 1660 Logan Ave. Ste A | San Diego, CA 92113<br/>To view our Privacy Policy: http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,8pii,7uiv,9dml,a0z3<br/><em>Elon Musk</em> Is <em>Really Boring</em> | Lost Winston Churchill Essay Reveals His Thoughts On Alien<br/>Life<br/>All the Power of a Windows 10 PC Right In Your Pocket<br/>As the world gets more advanced, technology is getting",
"WiFi and Bluetooth. Plus, with<br/>a wide range of inputs and outputs, you can link with just about any device you want. Learn More!<br/>Linus Torvalds: Talk of Tech Innovation is Bullshit. Shut Up and Get the Work Done <br/><em>Elon Musk</em> Is <em>Really Boring</em> <br/>FCC Chairman Wants It To Be Easier To Listen To Free FM Radio On Your Smartphone <br/>Lost Winston Churchill Essay Reveals His Thoughts On Alien Life <br/>JavaScript Attack Breaks ASLR On 22 CPU Architectures <br/>Ethicists",
"of innovation is smug, self-congratulatory, and self-serving. From a report on The Register: The term of art he used was more blunt: \"The innovation the industry talks about so much is...<br/><em>Elon Musk</em> Is <em>Really Boring</em> <br/>From the boring-company department<br/>Sometimes it is hard to tell if Elon Musk is serious about the things he says. But as for his \"boring\" claims, that's really happening. In a wide-range interview with Bloomberg"
]
}
},
"tags":[
],
"score":1
}
],
"took": 438.818418
"took":24.672135
}
```
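A client will usually want only a few fields from each hit. The sketch below assumes the response shape introduced in this commit (`meta` as a single object, a top-level `file_id`, and highlight fragments under `content.highlight.text`); the function names are hypothetical.

```python
import re


def plain_text(fragment):
    """Turn a highlight fragment into plain text by dropping <br/> and <em> markup."""
    without_breaks = fragment.replace("<br/>", " ")
    return re.sub(r"</?em>", "", without_breaks).strip()


def summarize_hits(response):
    """Return a (file_id, short_name, first highlight) tuple for every hit.

    Missing fields are tolerated: absent ids or names become None and a hit
    without highlight fragments gets an empty string.
    """
    rows = []
    for hit in response.get("hits", []):
        fragments = hit.get("content", {}).get("highlight", {}).get("text", [])
        first = plain_text(fragments[0]) if fragments else ""
        short_name = hit.get("meta", {}).get("short_name")
        rows.append((hit.get("file_id"), short_name, first))
    return rows
```

The `file_id` value is what the `api/files/direct/:fileId/...`, hide/unhide and tag endpoints take as their path parameter.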
### Error Response
@ -256,6 +357,12 @@ HTTP/1.1 400 BadRequest
GET api/search/:sha
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email.</p> |
| ambar-email-token | String | <p>User token.</p> |
### Parameters
@ -300,6 +407,12 @@ HTTP/1.1 400 BadRequest
GET api/sources/
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email.</p> |
| ambar-email-token | String | <p>User token.</p> |
### Success Response
@ -308,9 +421,7 @@ HTTP/1.1 200 OK
```
[
{
"_id": "58dce2754795070012ba2d42",
"id": "Default",
"index_name": "d033e22ae348aeb5660fc2140aec35850c4da997",
"description": "Automatically created on UI upload",
"type": "bucket"
},
@ -334,6 +445,12 @@ HTTP/1.1 200 OK
GET api/stats
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Success Response
@ -384,6 +501,126 @@ HTTP/1.1 200 OK
}
}
```
# Tags
## Delete Tag From File
DELETE api/tags/:fileId/:tagName
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Parameters
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| fileId | String | <p>File Id to delete tag from.</p> |
| tagName | String | <p>Tag name to delete.</p> |
### Success Response
HTTP/1.1 200 OK
```
{
"tags":[
{
"name":"ocr",
"filesCount":3
},
{
"name":"test",
"filesCount":2
},
{
"name":"pdf",
"filesCount":1
}
]
}
```
## Get Tags
GET api/tags/
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Success Response
HTTP/1.1 200 OK
```
[
{
"name":"ocr",
"filesCount":3
},
{
"name":"test",
"filesCount":2
},
{
"name":"pdf",
"filesCount":1
}
]
```
## Add Tag For File
POST api/tags/:fileId/:tagName
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Parameters
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| fileId | String | <p>File Id to add tag to.</p> |
| tagName | String | <p>Tag name to add.</p> |
### Success Response
HTTP/1.1 200 OK
```
{
"tagId":"e9536a83e64ff03617ab0379d835ac7bbf213bafb95cb42907a56e735472d4fc",
"tags":[
{
"name":"ocr",
"filesCount":3
},
{
"name":"test",
"filesCount":2
},
{
"name":"pdf",
"filesCount":1
}
]
}
```
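The add and delete endpoints share the same path and differ only in the HTTP method, so one helper can cover both. This is a minimal sketch that builds the request without sending it; `tag_request` is a hypothetical name and `ambar_api_address` is the placeholder host used elsewhere in this document.

```python
from urllib.parse import quote
from urllib.request import Request

# Placeholder host, taken from the curl examples in this documentation.
API_BASE = "http://ambar_api_address"


def tag_request(file_id, tag_name, email, token, delete=False):
    """Build POST (add) or DELETE (remove) for api/tags/:fileId/:tagName."""
    # Percent-encode both path segments so tags with spaces or slashes stay intact.
    path = f"/api/tags/{quote(file_id, safe='')}/{quote(tag_name, safe='')}"
    headers = {"ambar-email": email, "ambar-email-token": token}
    method = "DELETE" if delete else "POST"
    return Request(API_BASE + path, method=method, headers=headers)


if __name__ == "__main__":
    req = tag_request("some-file-id", "receipt", "user@example.com", "some-token")
    print(req.get_method(), req.full_url)
```

According to the success responses above, both calls return the updated tag list with per-tag `filesCount` values, and adding a tag also returns a `tagId`.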
# Thumbnails
## Get Thumbnail by Id
@ -428,4 +665,72 @@ HTTP/1.1 400 Bad Request
```
Request body is empty
```
# Users
## Login
POST api/users/login
### Parameters
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| email | String | <p>User Email</p> |
| password | String | <p>User Password</p> |
### Success Response
HTTP/1.1 200 OK
```
{
"token": "504d44935c2ccefb557fd49636a73239147b3895db2f2f...",
"ttl": "604800"
}
```
### Error Response
HTTP/1.1 400 BadRequest
```
Bad request
```
HTTP/1.1 404 NotFound
```
User with specified email not found
```
HTTP/1.1 409 Conflict
```
User is not in active state
```
HTTP/1.1 401 Unauthorized
```
Wrong password
```
## Logout
POST api/users/logout
### Headers
| Name | Type | Description |
|---------|-----------|--------------------------------------|
| ambar-email | String | <p>User email</p> |
| ambar-email-token | String | <p>User token</p> |
### Error Response
HTTP/1.1 401 Unauthorized
```
Unauthorized
```
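The token returned by Login is what the other endpoints expect in the `ambar-email-token` header. The sketch below shows that hand-off. It assumes the `email` and `password` parameters are sent as a JSON body, which this documentation does not state, so treat the encoding as a guess. Both function names are hypothetical.

```python
import json
from urllib.request import Request

# Placeholder host, taken from the curl examples in this documentation.
API_BASE = "http://ambar_api_address"


def login_request(email, password):
    """Build POST api/users/login.

    Assumption: parameters travel as a JSON body; the docs only list their names.
    """
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return Request(
        f"{API_BASE}/api/users/login",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def headers_from_login(email, response_text):
    """Turn a successful login response into the headers used by other endpoints."""
    token = json.loads(response_text)["token"]
    return {"ambar-email": email, "ambar-email-token": token}
```

The `ttl` field in the response is a string (`"604800"` in the example above); its unit is not documented here, so a client should not rely on it without checking.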

@ -1,6 +1,49 @@
Change log
==========
0.10.0 (2017-06-02)
-------------------
### New features
#### Tags:
- Add tags by typing them directly into a search result, with smart autosuggest. Separate tags with a comma or the `Enter` key
![Ambar Tags Adding](https://habrastorage.org/web/22b/9c3/0e7/22b9c30e7be14f94983bc46007280aa9.png)
- Search by one or several tags with a `tags:ocr,receipt` query
![Ambar Search By Tag](https://habrastorage.org/web/bd5/a5b/928/bd5a5b928b6f4617a50c249a6799d0c7.png)
- The `ocr` tag is automatically added to OCR-processed files; more automatic tags are coming soon
#### Files removing:
- You can now hide irrelevant search results with the *Hide* button, so they no longer appear in your search results
![Hide File Button](https://habrastorage.org/web/7fb/d0b/7d9/7fbd0b7d96ce4d3286f51132ac0bde72.png)
- You can search through hidden files with the `show:hidden` query
![Search Through Hidden Files](https://habrastorage.org/web/02e/351/0d5/02e3510d5e2746faac226aa0dae6a604.png)
- You can restore hidden files with the *Restore* button
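
The `tags:` and `show:` queries above are described for the search box. Assuming the `GET api/search` endpoint accepts the same syntax in its `query` parameter (the Web API documentation shows `?query=John` but does not confirm this for the new queries), a URL could be built as in the sketch below. `search_url` is a hypothetical helper and `ambar_api_address` is a placeholder host.

```python
from urllib.parse import urlencode

# Placeholder host, taken from the curl examples in the Web API documentation.
API_BASE = "http://ambar_api_address"


def search_url(query):
    """Build the api/search URL, percent-encoding characters such as ':' and ','."""
    params = urlencode({"query": query})
    return f"{API_BASE}/api/search?{params}"


if __name__ == "__main__":
    print(search_url("tags:ocr,receipt"))
    print(search_url("show:hidden"))
```

Encoding the query matters because `:` and `,` are reserved characters in URLs.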
#### UI:
- Last modified date is now displayed in a human-readable format
- Search result card design was significantly changed
- Main menu was placed on the right side of the header
### Bugfixes
- Fixed a bug where `auth:none` in `config.json` was ignored
- Other minor bug fixes
### Migration from 0.9.5
- Before updating Ambar to this version, you need to reset all of its data with `sudo ./ambar.py reset`. After the reset, run `sudo ./ambar.py update` to get the latest version.
0.9.5 (2017-05-19)
-------------------

@ -1,4 +1,4 @@
[![Version](https://img.shields.io/badge/Version-v0.9.5-brightgreen.svg)](https://ambar.cloud)
[![Version](https://img.shields.io/badge/Version-v0.10.0-brightgreen.svg)](https://ambar.cloud)
[![License](https://img.shields.io/badge/License-Fair%20Source%20v0.9-blue.svg)](https://github.com/RD17/ambar/blob/master/License.txt)
[![StackShare](https://img.shields.io/badge/tech-stack-0690fa.svg?style=flat)](https://stackshare.io/ambar/ambar)
@ -26,6 +26,7 @@ Ambar is a document search engine with automated crawling, OCR, deduplication an
* Search By File Path (filename:\*.txt)
* Search By Date (when: yesterday, today, lastweek, etc)
* Search By Size (size>1M)
* Search By Tags (tags:ocr)
* Search As You Type
* Supported language analyzers: English `ambar_en`, Russian `ambar_ru`, German `ambar_de`, Italian `ambar_it`, Polish `ambar_pl`, Chinese `ambar_cn`, CJK `ambar_cjk`
@ -35,7 +36,6 @@ Ambar is a document search engine with automated crawling, OCR, deduplication an
* [Mail Crawling](https://blog.ambar.cloud/crawling-and-searching-email-inbox-with-ambar/)
* [Dropbox Crawling](https://blog.ambar.cloud/how-to-search-through-your-dropbox-files-content/)
* Scheduled Crawling (Cron schedule syntax)
* Files Deduplication
### Content Extraction
* Extract content from large files (>30M)
@ -53,6 +53,8 @@ Ambar is a document search engine with automated crawling, OCR, deduplication an
### General
[Ambar features overview (Vimeo)](https://vimeo.com/202204412)
* Files Tagging
* Hiding Irrelevant Search Results
* Files Preview (with Google Docs View)
* Real-Time Statistics
* Web UI
@ -88,14 +90,14 @@ Yes, almost every Ambar's module is published on GitHub under [Fair Source Licen
Yes, Community Edition is forever free. We will NOT charge a penny from you to use it. Basic cloud account is also forever free.
### Does it perform OCR?
Yes, it performs OCR on images (jpg, tiff, bmp, etc) and PDFs. OCR is performed by the well-known open-source library Tesseract. We tuned it to achieve the best performance and quality on scanned documents.
Yes, it performs OCR on images (jpg, tiff, bmp, etc) and PDFs. OCR is performed by the well-known open-source library Tesseract. We tuned it to achieve the best performance and quality on scanned documents. You can easily find all files on which OCR was performed with the `tags:ocr` query.
### Which languages are supported for OCR?
Supported languages: Eng, Rus, Ita, Deu, Fra, Spa.
If you miss your language, please create an issue on GitHub and we'll add it ASAP.
### Does it support tagging?
Nope, we're working on it. As a workaround, you can use the folder hierarchy as a set of tags.
Yes!
### What about searching in PDF?
Yes, it can search through any PDF, even badly encoded or with scans inside. We did our best to make search over any kind of pdf document smooth.

@ -5,6 +5,8 @@
Ambar: Simple Document Management
================================
**OUTDATED - USE README.MD AS MANUAL**
If you like Ambar, give it a :star:!
[Introduction](https://ambar.cloud) | [Cloud](https://app.ambar.cloud) | [Blog](https://blog.ambar.cloud)
