Elasticsearch Bulk API : Cannot post more than one record

I am trying to post the following using the Bulk API. I am running Elasticsearch 2.2.0.

{"index":{"_index":"junktest","_type":"test"}}
{"DocumentID":"555662","Tags":["B","C","D"],"Summary":"Summary Text","Status":"Review","Location":"HDFS","Error":"None","Author":"Abc Mnb","Sector":"Energy","Created Date":"2013-05-23"},
{"DocumentID":"555663","Tags":["A","B","C"],"Summary":"Summary Text","Status":"Review","Location":"HDFS","Error":"None","Author":"Abc Mnb","Sector":"Energy","Created Date":"2013-04-25"}

as

curl -XPOST "http://localhost:9200/_bulk" --data-binary @post.json

but I get:

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Malformed action/metadata line [3], expected START_OBJECT or END_OBJECT but found [VALUE_STRING]"}],"type":"illegal_argument_exception","reason":"Malformed action/metadata line [3], expected START_OBJECT or END_OBJECT but found [VALUE_STRING]"},"status":400}

Why is `},` invalid? I have even tried it without the comma, but I still get the error even though there is no `,` in the file!

What is wrong with my syntax?

Edit

I was able to get it to work with:

{"index":{"_index":"junktest","_type":"test"}}
{"DocumentID":"555662","Tags":["B","C","D"],"Summary":"Summary Text","Status":"Review","Location":"HDFS","Error":"None","Author":"Abc Mnb","Sector":"Energy","Created Date":"2013-05-23"}
{"index":{"_index":"junktest","_type":"test"}}
{"DocumentID":"555663","Tags":["A","B","C"],"Summary":"Summary Text","Status":"Review","Location":"HDFS","Error":"None","Author":"Abc Mnb","Sector":"Energy","Created Date":"2013-04-25"}

Is this the only way to index multiple records using the Bulk API?

From the documentation

The REST API endpoint is /_bulk, and it expects the following JSON structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

The document source is optional, but the action_and_meta_data line is mandatory, and the two are separated by newlines. Given these constraints, you can only specify one record per action.
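Given that structure, a bulk body can be generated programmatically instead of hand-writing the action lines. A minimal Python sketch, using only the stdlib (the index/type names and document fields are taken from the question; `build_bulk_body` is my own helper name):

```python
import json

def build_bulk_body(index, doc_type, docs):
    """Build an NDJSON bulk body: one action/metadata line before every
    document, newline-separated, with a trailing newline at the end."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

# Documents from the question (abbreviated fields)
docs = [
    {"DocumentID": "555662", "Tags": ["B", "C", "D"]},
    {"DocumentID": "555663", "Tags": ["A", "B", "C"]},
]
body = build_bulk_body("junktest", "test", docs)
```

The resulting string can then be posted with `curl --data-binary` exactly as in the question.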

Also, in the example you provided you are not passing "_id" in the metadata, which means the "_id" is auto-generated. That is probably intentional, but remember that if you later intend to update a document, you will not be able to address it by DocumentID.

A line break after the last record is important to get it working. I solved a similar problem by adding \n (a line break) at the end of the last record.

e.g. this will not work:

"{ "name":"Central School", "description":"CBSE Affiliation", "street":"Nagan"}"

but this will

"{ "name":"Central School", "description":"CBSE Affiliation", "street":"Nagan"} \n "
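A quick guard against the missing final newline, sketched in Python (the helper name is my own):

```python
def ensure_trailing_newline(body):
    # The _bulk endpoint only processes a line once it sees the
    # terminating \n, so append one if the body lacks it.
    return body if body.endswith("\n") else body + "\n"
```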

There should be no line break inside a single JSON object. Previously I was using this input, and it was producing the above error:

{ 
"index" : { "_index" : "ecommerce", "_type" : "product", "_id" : "1002" 
} 
}
{ "id": 2}

Now it is working fine with the following input

{ "index" : { "_index" : "ecommerce", "_type" : "product", "_id" : "1002" } }
{ "id": 2}
{"index":{ "_index" : "ecommerce", "_type" : "product","_id":"1003"}}
{ "id": 3,"name":"Dot net"}
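One way to catch pretty-printed JSON before posting is to check that every physical line of the body parses as standalone JSON. A small stdlib Python sketch (the function name is hypothetical):

```python
import json

def bad_bulk_lines(body):
    """Return the 1-based numbers of lines that are not complete JSON
    objects; an object pretty-printed across lines shows up here."""
    bad = []
    for i, line in enumerate(body.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            json.loads(line)
        except ValueError:
            bad.append(i)
    return bad
```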

Comments
  • Yeah, I guess it is the only way. This is more useful for multiple updates across several indices, I guess. I was not including the IDs on purpose. Thanks.