From 95030af933f46fd3404882d63a1d3ecf2f24cc8e Mon Sep 17 00:00:00 2001 From: Ilya P Date: Fri, 2 Jun 2017 12:06:06 +0300 Subject: [PATCH] Updated Documentation and Readme --- API_DOC.md | 419 ++++++++++++++++++++++++++++++++++++++++++++------- CHANGELOG.md | 43 ++++++ README.md | 10 +- README_PL.md | 2 + 4 files changed, 413 insertions(+), 61 deletions(-) diff --git a/API_DOC.md b/API_DOC.md index f5d83df..91e0ae0 100644 --- a/API_DOC.md +++ b/API_DOC.md @@ -3,12 +3,14 @@ Ambar Web API documentation - [Files](#files) - - [Get Meta by Meta Id](#get-meta-by-meta-id) - - [Get File Source by Meta Id](#get-file-source-by-meta-id) - - [Get Parsed Text From File by Meta Id](#get-parsed-text-from-file-by-meta-id) - - [Get File Content by Secure Uri](#get-file-content-by-secure-uri) - - [Get Parsed Text by Secure Uri](#get-parsed-text-by-secure-uri) + - [Get File Meta by File Id](#get-file-meta-by-file-id) + - [Get File Source by File Id](#get-file-source-by-file-id) + - [Get Parsed Text From File by File Id](#get-parsed-text-from-file-by-file-id) + - [Download File Content by Secure Uri](#download-file-content-by-secure-uri) + - [Download Parsed Text by Secure Uri](#download-parsed-text-by-secure-uri) - [Upload File](#upload-file) + - [Hide File](#hide-file) + - [Unhide File](#unhide-file) - [Search](#search) - [Search For Documents By Query](#search-for-documents-by-query) @@ -20,20 +22,35 @@ Ambar Web API documentation - [Statistics](#statistics) - [Get Statistics](#get-statistics) +- [Tags](#tags) + - [Delete Tag From File](#delete-tag-from-file) + - [Get Tags](#get-tags) + - [Add Tag For File](#add-tag-for-file) + - [Thumbnails](#thumbnails) - [Get Thumbnail by Id](#get-thumbnail-by-id) - [Add or Update Thumbnail](#add-or-update-thumbnail) +- [Users](#users) + - [Login](#login) + - [Logout](#logout) + # Files -## Get Meta by Meta Id +## Get File Meta by File Id + + GET api/files/direct/:fileId/meta - GET api/files/direct/:metaId/meta +### Headers +| Name | Type | 
Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| ### Success Response @@ -49,12 +66,18 @@ HTTP/1.1 404 Not Found ``` File meta or content not found ``` -## Get File Source by Meta Id +## Get File Source by File Id + + GET api/files/direct/:fileId/source - GET api/files/direct/:metaId/source +### Headers +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| ### Success Response @@ -70,12 +93,18 @@ HTTP/1.1 404 Not Found ``` File meta or content not found ``` -## Get Parsed Text From File by Meta Id +## Get Parsed Text From File by File Id + + GET api/files/direct/:fileId/text - GET api/files/direct/:metaId/text +### Headers +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| ### Success Response @@ -91,7 +120,7 @@ HTTP/1.1 404 Not Found ``` File meta or content not found ``` -## Get File Content by Secure Uri +## Download File Content by Secure Uri @@ -112,7 +141,7 @@ HTTP/1.1 404 Not Found ``` File meta or content not found ``` -## Get Parsed Text by Secure Uri +## Download Parsed Text by Secure Uri @@ -139,6 +168,12 @@ File meta or content not found POST api/files/:sourceId/:filename +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| ### Examples @@ -156,8 +191,7 @@ http://ambar_api_address/api/files/Default/test.txt \ HTTP/1.1 200 OK ``` -HTTP/1.1 200 OK -Json format: { metaId: xxxxx } +{ "fileId": xxxxx } ``` ### Error Response @@ -171,6 +205,60 @@ HTTP/1.1 404 Not Found ``` File meta or content not found ``` +## Hide File + +
Hide file by file id
+ + PUT api/files/hide/:fileId + +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| + +### Success Response + +HTTP/1.1 200 OK + +``` +HTTP/1.1 200 OK +``` +### Error Response + +HTTP/1.1 404 NotFound + +``` +File not found +``` +## Unhide File + +
Unhide file by file id
+ + PUT api/files/unhide/:fileId + +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| + +### Success Response + +HTTP/1.1 200 OK + +``` +HTTP/1.1 200 OK +``` +### Error Response + +HTTP/1.1 404 NotFound + +``` +File not found +``` # Search ## Search For Documents By Query @@ -179,6 +267,12 @@ File meta or content not found GET api/search +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email. | +| ambar-email-token | String | User token.
| ### Parameters @@ -201,47 +295,54 @@ curl -i http://ambar_api_address/api/search?query=John HTTP/1.1 200 OK ``` -{ - "total": 1, - "hits": [ - { - "score": 0.5184899160927989, - "sha256": "318be2290125e0a6cfb7229133ba3c4632068ae04942ed5c7c660718d9d41eb3", - "meta": [ - { - "extension": ".txt", - "full_name": "//winserver/share/englishstories/aesop11.txt", - "updated_datetime": "2017-01-13 14:06:20.098", - "indexed_datetime": "2017-04-03 11:21:10.064", - "extra": [], - "short_name": "aesop11.txt", - "id": "b84b6d7cf88655e9916d5fbb67886b1befa0d00d99f58b62e72fb04b51ff3c31", - "source_id": "Books", - "created_datetime": "2017-01-13 14:06:20.026", - "download_uri": "b41c4aaa2999ce42957f087db8e7608970efcedb1eaa40c28336390ecb5373849c955f395258f3dfd7482d4b84d543cdcc27104c934cd4efdb0ba8c54e6e8e5f3367190091fa4db779fe2097565921e69be43e80068893bafa0dd0b98f90ec96469df54050dee2649b68646824da2cc32061c24a7ed93a6d1514a89a75360551267015c3035515bbb2971a1fcfdf456a" - } - ], - "content": { - "size": 229353, - "indexed_datetime": "2017-04-03 11:21:10.163", - "author": "", - "processed_datetime": "2017-04-03 11:21:10.163", - "length": "", - "language": "", - "thumb_available": false, - "state": "processed", - "title": "", - "type": "text/plain; charset=windows-1252" - "highlight": { - "text": [ - "taking no notice of the grain.
The Mule which had been robbed and wounded bewailed his
misfortunes. The other replied, \"I am indeed glad that I was
thought so little of, for I have lost nothing, nor am I hurt with
any wound.\"
The Viper and the File
A LION, entering the workshop of a smith, sought from the tools
the means of satisfying his hunger. He more particularly
addressed himself to a File, and asked of him the favor of a
meal. The File replied, \"You must indeed be a simple-minded
fellow if you expect to get anything from me, who am accustomed
to take from everyone, and", - "Aesop, by some strange accident it seems to have entirely
disappeared, and to have been lost sight of. His name is
mentioned by Avienus; by Suidas, a celebrated critic, at the
close of the eleventh century, who gives in his lexicon several
isolated verses of his version of the fables; and by John
Tzetzes, a grammarian and poet of Constantinople, who lived
during the latter half of the twelfth century. Nevelet, in the
preface to the volume which we have described, points out that
the Fables of Planudes could not be the work of Aesop, as they
contain a reference in two places to \"Holy" - ] - } + +{ + "total":1, + "hits":[ + { + "sha256":"60a777c59176e98efee98bf16b67983dc981ec4da3eaafcb4d79046d005456f9", + "meta":{ + "id":"ac8965ab5e07582e0e57cde0e7c4c2d49b955f8b26c779903191893fcb942fa4", + "full_name":"//mail.nic.ru/hello@ambar.cloud/linus torvalds talk of tech innovation is bullshit shut up and get the work done fcc chairman wants it to be easier to listen to free fm radio on your smartphone.eml", + "short_name":"linus torvalds talk of tech innovation is bullshit shut up and get the work done fcc chairman wants it to be easier to listen to free fm radio on your smartphone.eml", + "extension":".eml", + "extra":[ + ], + "source_id":"AmbarEmail", + "created_datetime":"2017-02-17 09:22:44.000", + "updated_datetime":"2017-02-17 09:22:44.000", + "download_uri":"b41c4aaa2999ce42957f087db8e7608970efcedb1eaa40c28336390ecb5373849c955f395258f3dfd7482d4b84d543cdfc23cff8df311276a5e111c0504315c60b159cd2fe2cee20c5470789d9d15e4d7e5fb7c2bc60c29bf9a578e47541fb354dcb5109e49ea9019b2d68c3b35e521a418d9c94f0af55dc79c2442188f039c924d0190c72f488ad77647f2a52aaa267" + }, + "indexed_datetime":"2017-05-31 13:36:40.400", + "file_id":"aa5e000fd79cfed0e839af7073e1ef135e128408f984b9a8e70e34242b49f01a", + "content":{ + "size":49282, + "author":"Slashdot Headlines ", + "ocr_performed":false, + "processed_datetime":"2017-05-31 13:36:40.361", + "length":"", + "language":"", + "thumb_available":false, + "state":"processed", + "title":"", + "type":"message/rfc822", + "highlight":{ + "text":[ + "__________________________________________________________________________
Linus Torvalds: Talk of Tech Innovation is Bullshit. Shut Up and Get the Work Done
http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,d9zf,fh0y,9dml,a0z3
Elon Musk Is Really Boring
http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,a5s3,9k63,9dml,a0z3
FCC Chairman Wants It To Be Easier To Listen To Free FM Radio On Your Smartphone
http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp", + "self-serving. From a report on The Register: The term of art he used was more blunt: \"The innovation the industry talks about so much is... Read More http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,aiki,d8f2,9dml,a0z3
Elon Musk Is Really Boring http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,ezm,35uk,9dml,a0z3
From the boring-company department
Sometimes it is hard to tell if Elon Musk is serious about the things he says. But as for his \"boring\" claims, that's", + "email to: unsubscribe-47676@elabs10.com
Slashdot | 1660 Logan Ave. Ste A | San Diego, CA 92113
To view our Privacy Policy: http://clicks.slashdot.org/ct.html?ufl=6&rtr=on&s=x8pb08,2qzsp,10sc,8pii,7uiv,9dml,a0z3
Elon Musk Is Really Boring | Lost Winston Churchill Essay Reveals His Thoughts On Alien
Life
All the Power of a Windows 10 PC Right In Your Pocket
As the world gets more advanced, technology is getting", + "WiFi and Bluetooth. Plus, with
a wide range of inputs and outputs, you can link with just about any device you want. Learn More!
Linus Torvalds: Talk of Tech Innovation is Bullshit. Shut Up and Get the Work Done
Elon Musk Is Really Boring
FCC Chairman Wants It To Be Easier To Listen To Free FM Radio On Your Smartphone
Lost Winston Churchill Essay Reveals His Thoughts On Alien Life
JavaScript Attack Breaks ASLR On 22 CPU Architectures
Ethicists", + "of innovation is smug, self-congratulatory, and self-serving. From a report on The Register: The term of art he used was more blunt: \"The innovation the industry talks about so much is...
Elon Musk Is Really Boring
From the boring-company department
Sometimes it is hard to tell if Elon Musk is serious about the things he says. But as for his \"boring\" claims, that's really happening. In a wide-range interview with Bloomberg" + ] + } + }, + "tags":[ + ], + "score":1 } - ], - "took": 438.818418 - } + ], + "took":24.672135 +} ``` ### Error Response @@ -256,6 +357,12 @@ HTTP/1.1 400 BadRequest GET api/search/:sha +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email. | +| ambar-email-token | String | User token.
| ### Parameters @@ -300,6 +407,12 @@ HTTP/1.1 400 BadRequest GET api/sources/ +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email. | +| ambar-email-token | String | User token.
| ### Success Response @@ -308,9 +421,7 @@ HTTP/1.1 200 OK ``` [ { - "_id": "58dce2754795070012ba2d42", "id": "Default", - "index_name": "d033e22ae348aeb5660fc2140aec35850c4da997", "description": "Automatically created on UI upload", "type": "bucket" }, @@ -334,6 +445,12 @@ HTTP/1.1 200 OK GET api/stats +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| ### Success Response @@ -384,6 +501,126 @@ HTTP/1.1 200 OK } } ``` +# Tags + +## Delete Tag From File + + + + DELETE api/tags/:fileId/:tagName + +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| + +### Parameters + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| fileId | String |
File Id to delete tag from. | +| tagName | String | Tag name to delete.
| + +### Success Response + +HTTP/1.1 200 OK + +``` +{ + "tags":[ + { + "name":"ocr", + "filesCount":3 + }, + { + "name":"test", + "filesCount":2 + }, + { + "name":"pdf", + "filesCount":1 + } + ] +} +``` +## Get Tags + + + + GET api/tags/ + +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| + +### Success Response + +HTTP/1.1 200 OK + +``` +[ + { + "name":"ocr", + "filesCount":3 + }, + { + "name":"test", + "filesCount":2 + }, + { + "name":"pdf", + "filesCount":1 + } +] +``` +## Add Tag For File + + + + POST api/tags/:fileId/:tagName + +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| + +### Parameters + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| fileId | String |
File Id to add tag to. | +| tagName | String | Tag name to add.
| + +### Success Response + +HTTP/1.1 200 OK + +``` +{ + "tagId":"e9536a83e64ff03617ab0379d835ac7bbf213bafb95cb42907a56e735472d4fc", + "tags":[ + { + "name":"ocr", + "filesCount":3 + }, + { + "name":"test", + "filesCount":2 + }, + { + "name":"pdf", + "filesCount":1 + } + ] +} +``` # Thumbnails ## Get Thumbnail by Id @@ -428,4 +665,72 @@ HTTP/1.1 400 Bad Request ``` Request body is empty ``` +# Users + +## Login + + + + POST api/users/login + + +### Parameters + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| email | String |
User Email | +| password | String | User Password
| + +### Success Response + +HTTP/1.1 200 OK + +``` +{ + "token": "504d44935c2ccefb557fd49636a73239147b3895db2f2f...", + "ttl": "604800" +} +``` +### Error Response + +HTTP/1.1 400 BadRequest + +``` +Bad request +``` +HTTP/1.1 404 NotFound + +``` +User with specified email not found +``` +HTTP/1.1 409 Conflict + +``` +User is not in active state +``` +HTTP/1.1 401 Unauthorized + +``` +Wrong password +``` +## Logout + + + + POST api/users/logout + +### Headers + +| Name | Type | Description | +|---------|-----------|--------------------------------------| +| ambar-email | String |
User email | +| ambar-email-token | String | User token
| + +### Error Response + +HTTP/1.1 401 Unauthorized + +``` +Unauthorized +``` diff --git a/CHANGELOG.md b/CHANGELOG.md index 7307587..5f22947 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,49 @@ Change log ========== +0.10.0 (2017-06-02) +------------------- + +### New features + +#### Tags: + - Add tags by typing them right into a search result, with smart autosuggest. Tags are separated by a comma or the `Enter` key + + ![Ambar Tags Adding](https://habrastorage.org/web/22b/9c3/0e7/22b9c30e7be14f94983bc46007280aa9.png) + + - Search by one or several tags with a `tags:ocr,receipt` query + +![Ambar Search By Tag](https://habrastorage.org/web/bd5/a5b/928/bd5a5b928b6f4617a50c249a6799d0c7.png) + + - The `ocr` tag is automatically added to OCR-processed files; more auto tags are coming soon + +#### Files removing: + + - Now you can hide irrelevant search results with the *Hide* button, so they never appear in your search results again + + ![Hide File Button](https://habrastorage.org/web/7fb/d0b/7d9/7fbd0b7d96ce4d3286f51132ac0bde72.png) + + - You can search through hidden files with a `show:hidden` query + + ![Search Through Hidden Files](https://habrastorage.org/web/02e/351/0d5/02e3510d5e2746faac226aa0dae6a604.png) + + - You can restore hidden files with the *Restore* button + +#### UI: + - Last modified date is now displayed in a human-readable format + - Search result card design was significantly changed + - Main menu was moved to the right side of the header + +### Bugfixes + + - Fixed a bug where `auth:none` in `config.json` was ignored + - Other minor bug fixes + +### Migration from 0.9.5 + + - Before updating your Ambar to this version, you need to reset all the data in your Ambar with `sudo ./ambar.py reset`. After the reset, run `sudo ./ambar.py update` to get the latest version. 
+ + 0.9.5 (2017-05-19) ------------------- diff --git a/README.md b/README.md index b23bd29..0bfe06f 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -[![Version](https://img.shields.io/badge/Version-v0.9.5-brightgreen.svg)](https://ambar.cloud) +[![Version](https://img.shields.io/badge/Version-v0.10.0-brightgreen.svg)](https://ambar.cloud) [![License](https://img.shields.io/badge/License-Fair%20Source%20v0.9-blue.svg)](https://github.com/RD17/ambar/blob/master/License.txt) [![StackShare](https://img.shields.io/badge/tech-stack-0690fa.svg?style=flat)](https://stackshare.io/ambar/ambar) @@ -26,6 +26,7 @@ Ambar is a document search engine with automated crawling, OCR, deduplication an * Search By File Path (filename:\*.txt) * Search By Date (when: yesterday, today, lastweek, etc) * Search By Size (size>1M) +* Search By Tags (tags:ocr) * Search As You Type * Supported language analyzers: English `ambar_en`, Russian `ambar_ru`, German `ambar_de`, Italian `ambar_it`, Polish `ambar_pl`, Chinese `ambar_cn`, CJK `ambar_cjk` @@ -35,7 +36,6 @@ Ambar is a document search engine with automated crawling, OCR, deduplication an * [Mail Crawling](https://blog.ambar.cloud/crawling-and-searching-email-inbox-with-ambar/) * [Dropbox Crawling](https://blog.ambar.cloud/how-to-search-through-your-dropbox-files-content/) * Scheduled Crawling (Cron schedule syntax) -* Files Deduplication ### Content Extraction * Extract content from large files (>30M) @@ -53,6 +53,8 @@ Ambar is a document search engine with automated crawling, OCR, deduplication an ### General [Ambar features overview (Vimeo)](https://vimeo.com/202204412) +* Files Tagging +* Hiding Irrelevant Search Results * Files Preview (with Google Docs View) * Real-Time Statistics * Web UI @@ -88,14 +90,14 @@ Yes, almost every Ambar's module is published on GitHub under [Fair Source Licen Yes, Community Edition is forever free. We will NOT charge a penny from you to use it. Basic cloud account is also forever free. 
### Does it perform OCR? -Yes, it performs OCR on images (jpg, tiff, bmp, etc) and PDF's. OCR is perfomed by well-known open-source library Tesseract. We tuned it to achieve best perfomance and quality on scanned documents. +Yes, it performs OCR on images (jpg, tiff, bmp, etc) and PDFs. OCR is performed by the well-known open-source library Tesseract. We tuned it to achieve the best performance and quality on scanned documents. You can easily find all files on which OCR was performed with the `tags:ocr` query. ### Which languages are supported for OCR? Supported languages: Eng, Rus, Ita, Deu, Fra, Spa. If you miss your language, please create an issue on GitHub and we'll add it ASAP. ### Does it support tagging? -Nope, we working on it. As a workaround you can use folders hierarchy as a set of tags. +Yes! ### What about searching in PDF? Yes, it can search through any PDF, even badly encoded or with scans inside. We did our best to make search over any kind of pdf document smooth. diff --git a/README_PL.md b/README_PL.md index 083605b..05880cc 100644 --- a/README_PL.md +++ b/README_PL.md @@ -5,6 +5,8 @@ Ambar: Proste Zarządzanie Dokumentami ================================ +**OUTDATED - USE README.MD AS MANUAL** + Jeśli Ambar Ci się podoba, daj :star:! [Wstęp](https://ambar.cloud) | [Chmura](https://app.ambar.cloud) | [Blog](https://blog.ambar.cloud)
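
The endpoints this patch documents (login, the `ambar-email` / `ambar-email-token` headers, tags, hide, search) can be exercised roughly as in the sketch below. It only builds the requests and does not send them. The host `ambar_api_address` is the placeholder used by the curl examples in API_DOC.md. The helper names are hypothetical, and sending the login parameters as a JSON body is an assumption, since the documentation lists `email` and `password` without stating how they are transmitted.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Placeholder host, as used in the curl examples of API_DOC.md.
API_BASE = "http://ambar_api_address"


def auth_headers(email, token):
    """Build the two headers documented for authenticated endpoints."""
    return {"ambar-email": email, "ambar-email-token": token}


def login_request(email, password):
    """POST api/users/login. Assumption: parameters travel as a JSON body."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return Request(
        f"{API_BASE}/api/users/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def add_tag_request(email, token, file_id, tag_name):
    """POST api/tags/:fileId/:tagName, with path segments percent-encoded."""
    url = f"{API_BASE}/api/tags/{quote(file_id, safe='')}/{quote(tag_name, safe='')}"
    return Request(url, headers=auth_headers(email, token), method="POST")


def hide_file_request(email, token, file_id):
    """PUT api/files/hide/:fileId."""
    url = f"{API_BASE}/api/files/hide/{quote(file_id, safe='')}"
    return Request(url, headers=auth_headers(email, token), method="PUT")


def search_request(email, token, query):
    """GET api/search?query=..., e.g. query='tags:ocr,receipt'."""
    url = f"{API_BASE}/api/search?{urlencode({'query': query})}"
    return Request(url, headers=auth_headers(email, token), method="GET")
```

Any of these objects can be passed to `urllib.request.urlopen` to send it. Presumably the `token` field of the login response is the value expected in `ambar-email-token`, though the patch does not say so explicitly.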