Hot questions for Spring Data MongoDB


Question:

For this example from the MongoDB documentation, how do I write the query using MongoTemplate?

db.sales.aggregate(
   [
      {
        $group : {
           _id : { month: { $month: "$date" }, day: { $dayOfMonth: "$date" }, year: { $year: "$date" } },
           totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } },
           averageQuantity: { $avg: "$quantity" },
           count: { $sum: 1 }
        }
      }
   ]
)

Or in general, how do I group by a calculated field?


Answer:

You can actually do something like this with $project first, but to me it's a little counter-intuitive to require a $project stage beforehand:

    Aggregation agg = newAggregation(
        project("quantity")
            .andExpression("dayOfMonth(date)").as("day")
            .andExpression("month(date)").as("month")
            .andExpression("year(date)").as("year")
            .andExpression("price * quantity").as("totalAmount"),
        group(fields().and("day").and("month").and("year"))
            .avg("quantity").as("averavgeQuantity")
            .sum("totalAmount").as("totalAmount")
            .count().as("count")
    );

Like I said, it's counter-intuitive, as you should be able to declare all of this under the $group stage, but the helpers don't seem to work that way. The serialization comes out a bit funny (it wraps the date operator arguments in arrays), but it does seem to work. Still, this is two pipeline stages rather than one.

What is the problem with this? By separating the stages, the $project portion forces the processing of all of the documents in the pipeline in order to obtain the calculated fields, which means everything passes through that stage before moving on to the group stage.

The difference in processing time can be clearly seen by running the query in both forms. With a separate $project stage, on my hardware it takes three times longer to execute than the query where all fields are calculated during the $group operation.

So it seems the only way at present to construct this properly is by building the pipeline object yourself:

    ApplicationContext ctx =
            new AnnotationConfigApplicationContext(SpringMongoConfig.class);
    MongoOperations mongoOperation = (MongoOperations) ctx.getBean("mongoTemplate");

    BasicDBList pipeline = new BasicDBList();
    String[] multiplier = { "$price", "$quantity" };

    pipeline.add(
        new BasicDBObject("$group",
            new BasicDBObject("_id",
                new BasicDBObject("month", new BasicDBObject("$month", "$date"))
                    .append("day", new BasicDBObject("$dayOfMonth", "$date"))
                    .append("year", new BasicDBObject("$year", "$date"))
            )
            .append("totalPrice", new BasicDBObject(
                "$sum", new BasicDBObject(
                    "$multiply", multiplier
                )
            ))
            .append("averageQuantity", new BasicDBObject("$avg", "$quantity"))
            .append("count",new BasicDBObject("$sum",1))
        )
    );

    BasicDBObject aggregation = new BasicDBObject("aggregate","collection")
        .append("pipeline",pipeline);

    System.out.println(aggregation);

    CommandResult commandResult = mongoOperation.executeCommand(aggregation);
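
As a hedged usage sketch (assuming an older server/driver combination where the aggregate command returns its documents inline rather than as a cursor), the resulting documents can then be read from the "result" key of the CommandResult:

    // read the aggregation output back from the command result document
    BasicDBList results = (BasicDBList) commandResult.get("result");
    for (Object doc : results) {
        System.out.println(doc);  // each entry is one grouped document
    }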

Or if all of that seems too terse to you, then you can always work with the JSON source and parse that. But of course, it has to be valid JSON:

    String json = "[" +
        "{ \"$group\": { "+
            "\"_id\": { " +
                "\"month\": { \"$month\": \"$date\" }, " +
                "\"day\": { \"$dayOfMonth\":\"$date\" }, " +
                "\"year\": { \"$year\": \"$date\" } " +
            "}, " +
            "\"totalPrice\": { \"$sum\": { \"$multiply\": [ \"$price\", \"$quantity\" ] } }, " +
            "\"averageQuantity\": { \"$avg\": \"$quantity\" }, " +
            "\"count\": { \"$sum\": 1 } " +
        "}}" +
    "]";

    BasicDBList pipeline = (BasicDBList)com.mongodb.util.JSON.parse(json);
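
The parsed pipeline can then be plugged into the same command document as in the previous example (a hedged sketch reusing the names introduced above):

    BasicDBObject aggregation = new BasicDBObject("aggregate", "collection")
        .append("pipeline", pipeline);

    CommandResult commandResult = mongoOperation.executeCommand(aggregation);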

Question:

My MongoDB json structure is

{
    "_id" : "122134231234234",
    "name" : "Total_pop",
    "description" : "sales category",
    "source" : "public",
    "dataset" : "d1"
},
{
    "_id" : "1123421231234234",
    "name" : "Total_pop",
    "description" : "sales category",
    "source" : "public",
    "dataset" : "d1"
},
{
    "_id" : "12312342332423343",
    "name" : "Total_pop",
    "description" : "sales category",
    "source" : "private",
    "dataset" : "d1"
}

I need to get the distinct values of dataset from the collection where source is public. I tried this query, and it didn't work:

Criteria criteria = new Criteria();
criteria.where("source").in("public");
query.addCriteria(criteria);
query.fields().include("name");
query.fields().include("description");
query.fields().include("source");
List list = mongoTemplate.getCollection("collectionname").distinct("source", query);

Can you please help me out?


Answer:

For one thing, the .getCollection() method returns the basic driver collection object, like so:

DBCollection collection = mongoTemplate.getCollection("collectionName");

So the type of query object might be different from what you are using, but there are also some other things. Namely, .distinct() only returns the "distinct" values of the key that you asked for, and does not return other fields of the document. So you could do:

Criteria criteria = Criteria.where("dataset").is("d1");
Query query = new Query();
query.addCriteria(criteria);
List list = mongoTemplate.getCollection("collectionName")
    .distinct("source", query.getQueryObject());

But that is only going to return the distinct values themselves (for instance "public") as elements in the list.

If you want the "fields" from a distinct set, then use the .aggregate() method instead, with either the "first" occurrences of the other field values for the distinct key:

    DBCollection collection = mongoTemplate.getCollection("collectionName");

    List<DBObject> pipeline = Arrays.<DBObject>asList(
        new BasicDBObject("$match",new BasicDBObject("dataset","d1")),
        new BasicDBObject("$group",
            new BasicDBObject("_id","$source")
                .append("name",new BasicDBObject("$first","$name"))
                .append("description", new BasicDBObject("$first","$description"))
        )
    );

    AggregationOutput output = collection.aggregate(pipeline);

Or the actual "distinct" values of multiple fields, by making them all part of the grouping key:

    DBCollection collection = mongoTemplate.getCollection("collectionName");

    List<DBObject> pipeline = Arrays.<DBObject>asList(
        new BasicDBObject("$match",new BasicDBObject("dataset","d1")),
        new BasicDBObject("$group",
            new BasicDBObject("_id",
                new BasicDBObject("source","$source")
                    .append("name","$name")
                    .append("description","$description")
            )
        )
    );

    AggregationOutput output = collection.aggregate(pipeline);

There is also a direct .aggregate() method on MongoTemplate instances, which comes with a number of helper methods for building pipelines. But this should point you in the right direction at least.
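
For reference, a hedged sketch of that MongoTemplate helper form for the "first occurrence" variant above (reading the results back as BasicDBObject is an assumption here, not part of the original answer):

    // typed pipeline builders instead of hand-built BasicDBObjects
    Aggregation agg = Aggregation.newAggregation(
        Aggregation.match(Criteria.where("dataset").is("d1")),
        Aggregation.group("source")
            .first("name").as("name")
            .first("description").as("description")
    );

    AggregationResults<BasicDBObject> results =
        mongoTemplate.aggregate(agg, "collectionName", BasicDBObject.class);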

Question:

This is my MongoDb structure,

db.user.find();

user:

{
    "name" : "KSK",
    "claim" : [ ObjectId("52ffc4a5d85242602e000000"), ObjectId("52ffc4a5d85242602e000001") ]
}

claim:

[
   {
     "_id" : "52ffc4a5d85242602e000001",
     "claimName" :"XXXX"
   },
   {
     "_id" : "52ffc4a5d85242602e000000",
     "claimName" :"YYY"
   }
]

My Entity class is:

@Document(collection="user")
public  class User{
  @Id      
  private String id;
  private String name; 
  @DBRef
private List<Claim> claim; 
// setter and getter 

}

Claim Class:

@Document(collection="Claim")
public class Claim{
   @Id 
   private String id; 
   private String claimName;   
}

I have a method to get the users by name like below,

public User findByName(String name);

If I try to hit this method, I get the error:

No converter found capable of converting from type org.bson.types.ObjectId to type java.lang.String

So I changed my User entity class as below.

Instead of private List<Claim> claim;

I now use private List<ObjectId> claim;

Now when I execute the method (findByName), I get a user object that holds both claim object ids ("52ffc4a5d85242602e000001", "52ffc4a5d85242602e000000"), and I then iterate over the claim list and fetch the claim details corresponding to each claim object id.

Instead of doing this, I want the findByName method to return the user together with the claim details. How can I achieve this?


Answer:

If you reference your Claims in the User class with @DBRef, your stored JSON should contain not only the ID but also a reference to the collection in which to find that ID, like this:

{
  "name" : "KSK",
  "claim" : [
     { "$ref" : "claim",   // the target collection
       "$id"  : ObjectId("52ffc4a5d85242602e000000")
     }
  ]
}

That is how Spring-Data maps your Java objects to MongoDB. If you start with a blank database and let Spring create and save the relations, you should have no problems using

 @DBRef List<Claim> claims;
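
Note that Spring Data MongoDB does not cascade saves of @DBRef targets, so the referenced documents have to be persisted first. A minimal hedged sketch (the setters are assumed, since they are not shown in the original classes):

    // save the referenced documents before the document that points at them
    Claim claim = new Claim();
    claim.setClaimName("XXXX");
    mongoTemplate.save(claim);

    User user = new User();
    user.setName("KSK");
    user.setClaim(Arrays.asList(claim));
    mongoTemplate.save(user);   // stores { "$ref", "$id" } entries pointing at the Claim collection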

Question:

I have a function that receive an Aggregation aggregation as a param.

I would like to get all of the AggregationOperations from aggregation. Is there any way to do that?

public Aggregation newCustomAggregation(Aggregation aggregation, Criteria c) {
    // How to get list operation aggregation there?
    listOperation.push(Aggregation.match(c));
    return Aggregation
            .newAggregation(listOperations);
}

My goal is to build a new Aggregation that also includes my custom match operation.


Answer:

You can create your own custom aggregation implementation by subclassing Aggregation to access its protected operations field.

Something like

public class CustomAggregation extends Aggregation {

    List<AggregationOperation> getAggregationOperations() {
        return operations;
    }
}

public Aggregation newCustomAggregation(Aggregation aggregation, Criteria c) {
    CustomAggregation customAggregation = (CustomAggregation) aggregation;
    List<AggregationOperation> listOperations = customAggregation.getAggregationOperations();
    listOperations.add(Aggregation.match(c));
    return Aggregation.newAggregation(listOperations);
}
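
One caveat, added here as a hedged note: the cast above only succeeds if the incoming Aggregation really was created as a CustomAggregation. One way to guarantee that (assuming a Spring Data version where Aggregation exposes a protected List constructor and the protected operations field, as the original answer relies on) is to give the subclass its own constructor and use it wherever the pipeline is first built:

    public class CustomAggregation extends Aggregation {

        public CustomAggregation(List<AggregationOperation> operations) {
            super(operations);      // Aggregation(List<AggregationOperation>) is protected
        }

        public List<AggregationOperation> getAggregationOperations() {
            return operations;      // protected field inherited from Aggregation
        }
    }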

Question:

I have the model like following

@CompoundIndexes(value = {
        @CompoundIndex(name = "catalog_idx", def = "{'code' : 1, 'brand' : 1}", unique = true) })
@Document(collection = Catalog.ENTITY)
public class Catalog extends AbstractModel<String> {

    private static final long serialVersionUID = 1L;

    public static final String ENTITY = "catalog";

    @NotNull(message = "Code is required")
    @Field("code")
    private String code;

    @NotNull(message = "Brand is required")
    @DBRef(lazy = true)
    @Field("brand")
    private Brand brand;
}

When I save with mongoTemplate.save(object); I see only 2 documents created in the DB instead of 6. Below are my debug lines for the objects just before saving.

Catalog [code=StagedCatalog, brand=Brand [code=Brand_3]]
Catalog [code=StagedCatalog, brand=Brand [code=Brand_2]]
Catalog [code=StagedCatalog, brand=Brand [code=Brand_1]]
Catalog [code=OnlineCatalog, brand=Brand [code=Brand_2]]
Catalog [code=OnlineCatalog, brand=Brand [code=Brand_1]]
Catalog [code=OnlineCatalog, brand=Brand [code=Brand_3]]

Any ideas why? I feel the unique index is not working somehow. I want code and brand to be a unique combination.

public abstract class AbstractModel<ID extends Serializable> implements Serializable {

    private static final long serialVersionUID = 1L;

    @Id
    private ID id;
}

Answer:

You have set a unique index. It means that you will not be able to have two documents with the same code and brand.

You have also typed the id field as a generic ID object. The fact that you get 2 documents instead of 6 suggests that you are reusing the same id for 3 inserts, something like:

for (code: {"StagedCatalog","OnlineCatalog"} ) {
    ID id=new ID(...);
    for (brand: {1, 2, 3}){
        Catalog cat=new Catalog();
        cat.setId(id);              // <<== this is wrong, you reuse the same id, you will insert first brand, then update to brand2 and brand3.
        cat.setCode(code);
        cat.setBrand(brand);
        mongoTemplate.save(cat);
    }
}

To prevent that, you need to:

Catalog cat=new Catalog();
ID id=new ID(realUniqueId);  // RealuniqueId can be code+brand for instance
cat.setId(id); 
...
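
Alternatively (a hedged sketch, not part of the original answer): if the id field is simply left null, Spring Data and the driver generate a fresh ObjectId per document on insert, so each save produces its own document and the unique index on code and brand still guards against real duplicates.

    Catalog cat = new Catalog();
    cat.setCode(code);          // setters assumed; not shown in the original model
    cat.setBrand(brand);
    // no cat.setId(...) call: an id is generated automatically on insert
    mongoTemplate.save(cat);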

Question:

I am doing a MongoDB aggregation. I want to look up two collections and then project only the desired fields in the nested arrays.

Two collections to lookup:

db.pitcher.find().pretty()

{
    "_id" : ObjectId("59b22eeef224252e6c7eeaf6"),
    "userId" : "a0",
    "name" : "test50000",
    "index" : 50000,
    "position" : "SP",
    "order" : 0,
    "gameRecord" : [
        {
            "seasonIndex" : 2017251,
            "gameIndex" : 1,
            "ERA" : 3.00
        },
        {
            "seasonIndex" : 2017251,
            "gameIndex" : 2,
            "ERA" : 4.50
        }
    ]
}

db.gameResult.find().pretty()

{
        "_id" : ObjectId("59b22b7dac48252e6c7eeaf6"),
        "seasonIndex" : 2017251,
        "gameIndex" : 1,
        "away" : "a9",
        "home" : "a0",
        "awayScore" : 9,
        "homeScore" : 4,
        "awayPitcherList" : [
                50180
        ],
        "homePitcherList" : [
                50000,
                50049,
                50048,
                50047
        ]
      }

Aggregate query:

> db.gameResult.aggregate([
{
    $match : {gameIndex : 1 ,home : "a0"}
},
{
    $lookup:
        {
        from: "pitcher",
        localField : "awayPitcherList",
        foreignField : "index",
        as: "awayPitcherList"
        }
},
{
    $lookup:
        {
        from: "pitcher",
        localField : "homePitcherList",
        foreignField : "index",
        as: "homePitcherList"
        }
}
]).pretty()

Finally, the desired output:

"_id" : ObjectId("59b22b7dac48252e6c7eeaf6"),
"seasonIndex" : 2017251,
"gameIndex" : 1,
"away" : "a9",
"home" : "a0",
"awayScore" : 9,
"homeScore" : 4,

"awayPitcherList" : [
    {
    "name" : "test50180",
    "gameRecord" : [
        {
                "seasonIndex" : 2017251,
                "gameIndex" : 1,
                "ERA" : 3.00,
        }
]
],
"homePitcherList" : [
             {
    "name" : "test50000",
    "gameRecord" : [
        {
                "seasonIndex" : 2017251,
                "gameIndex" : 1,
                "ERA" : 3.00,
        }
],
             {
    "name" : "test50049",
    "gameRecord" : [
        {
                "seasonIndex" : 2017251,
                "gameIndex" : 1,
                "ERA" : 3.00,
        }
],
             {
    "name" : "test50048",
    "gameRecord" : [
        {
                "seasonIndex" : 2017251,
                "gameIndex" : 1,
                "ERA" : 3.00,
        }
],
            {
    "name" : "test50047",
    "gameRecord" : [
        {
                "seasonIndex" : 2017251,
                "gameIndex" : 1,
                "ERA" : 3.00,
        }
]
]

I want only name and gameRecord, where gameRecord contains only the entries with gameIndex 1 (in this case).

Please improve my aggregate query. Many thanks in advance for Spring code if you have it.


Answer:

You can use the following query in MongoDB 3.4.

The query uses $addFields to overwrite the existing awayPitcherList with an updated awayPitcherList that includes only name and gameRecord.

Inside it, $map keeps the name field and $filter retains only the gameRecord elements with a matching gameIndex.

A similar set of stages handles homePitcherList.

db.gameResult.aggregate(
[
  {
    "$match": {
      "gameIndex": 1,
      "home": "a0"
    }
  },
  {
    "$lookup": {
      "from": "pitcher",
      "localField": "awayPitcherList",
      "foreignField": "index",
      "as": "awayPitcherList"
    }
  },
  {
    "$addFields": {
      "awayPitcherList": {
        "$map": {
          "input": "$awayPitcherList",
          "as": "awayPitcher",
          "in": {
            "name": "$$awayPitcher.name",
            "gameRecord": {
              "$filter": {
                "input": "$$awayPitcher.gameRecord",
                "as": "gameRecord",
                "cond": {
                  "$eq": [
                    "$$gameRecord.gameIndex",
                    1
                  ]
                }
              }
            }
          }
        }
      }
    }
  },
  {
    "$lookup": {
      "from": "pitcher",
      "localField": "homePitcherList",
      "foreignField": "index",
      "as": "homePitcherList"
    }
  },
  {
    "$addFields": {
      "homePitcherList": {
        "$map": {
          "input": "$homePitcherList",
          "as": "homePitcher",
          "in": {
            "name": "$$homePitcher.name",
            "gameRecord": {
              "$filter": {
                "input": "$$homePitcher.gameRecord",
                "as": "gameRecord",
                "cond": {
                  "$eq": [
                    "$$gameRecord.gameIndex",
                    1
                  ]
                }
              }
            }
          }
        }
      }
    }
  }
])

Use the below aggregate query for MongoDB 3.2.

 db.gameResult.aggregate(
    [
      {
        "$match": {
          "gameIndex": 1,
          "home": "a0"
        }
      },
      {
        "$lookup": {
          "from": "pitcher",
          "localField": "awayPitcherList",
          "foreignField": "index",
          "as": "awayPitcherList"
        }
      },
      {
        "$project": {
          "homePitcherList":1,
          "awayPitcherList": {
            "$map": {
              "input": "$awayPitcherList",
              "as": "awayPitcher",
              "in": {
                "name": "$$awayPitcher.name",
                "gameRecord": {
                  "$filter": {
                    "input": "$$awayPitcher.gameRecord",
                    "as": "gameRecord",
                    "cond": {
                      "$eq": [
                        "$$gameRecord.gameIndex",
                        1
                      ]
                    }
                  }
                }
              }
            }
          }
        }
      },
      {
        "$lookup": {
          "from": "pitcher",
          "localField": "homePitcherList",
          "foreignField": "index",
          "as": "homePitcherList"
        }
      },
      {
        "$project": {
          "awayPitcherList":1,
          "homePitcherList": {
            "$map": {
              "input": "$homePitcherList",
              "as": "homePitcher",
              "in": {
                "name": "$$homePitcher.name",
                "gameRecord": {
                  "$filter": {
                    "input": "$$homePitcher.gameRecord",
                    "as": "gameRecord",
                    "cond": {
                      "$eq": [
                        "$$gameRecord.gameIndex",
                        1
                      ]
                    }
                  }
                }
              }
            }
          }
        }
      }
    ])
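
Since the question also asked for Spring code: a hedged sketch of one way to run this pipeline from Spring Data is to wrap the raw stages in custom AggregationOperation instances. This assumes a Spring Data MongoDB 2.x style API where AggregationOperation can be implemented as a lambda returning the stage as a Document; the names here are illustrative, not part of the original answer.

    AggregationOperation match = Aggregation.match(
        Criteria.where("gameIndex").is(1).and("home").is("a0"));

    // raw $lookup stage, built as a Document
    AggregationOperation lookupAway = context -> new Document("$lookup",
        new Document("from", "pitcher")
            .append("localField", "awayPitcherList")
            .append("foreignField", "index")
            .append("as", "awayPitcherList"));

    // the $addFields stage can be passed through verbatim, e.g. by parsing the JSON
    // from the shell query above
    AggregationOperation addAwayFields = context -> Document.parse(
        "{ \"$addFields\": { \"awayPitcherList\": { \"$map\": { "
        + "\"input\": \"$awayPitcherList\", \"as\": \"awayPitcher\", \"in\": { "
        + "\"name\": \"$$awayPitcher.name\", \"gameRecord\": { \"$filter\": { "
        + "\"input\": \"$$awayPitcher.gameRecord\", \"as\": \"gameRecord\", "
        + "\"cond\": { \"$eq\": [ \"$$gameRecord.gameIndex\", 1 ] } } } } } } } }");

    // analogous $lookup and $addFields stages would be added for homePitcherList
    Aggregation aggregation = Aggregation.newAggregation(match, lookupAway, addAwayFields);
    AggregationResults<Document> results =
        mongoTemplate.aggregate(aggregation, "gameResult", Document.class);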

Question:

When serializing from Java to JSON, Jackson generates an extra target property for referenced entities when using the Spring Data MongoDB @DBRef annotation with lazy loading and Jackson’s polymorphic type handling. Why does this occur, and is it possible to omit the extra target property?

Code Example

@Document(collection = "cdBox")
public class CDBox {
  @Id
  public String id;

  @DBRef(lazy = true)
  public List<Product> products;
}

@Document(collection = "album")
public class Album extends Product {
  @DBRef(lazy = true)
  public List<Song> songs;
}

@Document(collection = "single")
public class Single extends Product {
  @DBRef(lazy = true)
  public List<Song> songs;
}

@Document(collection = "song")
public class Song {
  @Id
  public String id;

  public String title;
}

@JsonTypeInfo(use = JsonTypeInfo.Id.NAME,
                    property = "productType",
                    include = JsonTypeInfo.As.EXTERNAL_PROPERTY)
@JsonSubTypes(value = {
    @JsonSubTypes.Type(value = Single.class),
    @JsonSubTypes.Type(value = Album.class)
})
public abstract class Product {
  @Id
  public String id;
}

Generated JSON

{
  "id": "someId1",
  "products": [
    {
      "id": "someId2",
      "songs": [
        {
          "id": "someId3",
          "title": "Some title",
          "target": {
            "id": "someId3",
            "title": "Some title"
          }
        }
      ]
    }
  ]
}

Answer:

The target field is added by Spring Data because the collection is loaded lazily, so each referenced object is wrapped in a proxy that holds the resolved document, much like the dataHandler and similar fields Hibernate adds for JPA.

Option 1: to ignore it, just add @JsonIgnoreProperties(value = { "target" }) at the class level:

@Document(collection = "song")
@JsonIgnoreProperties(value = { "target" })
public class Song {
 ...
}

Option 2: make the collection not lazy.
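
A minimal sketch of option 2 (declaring the reference eagerly, so no lazy-loading proxy, and hence no target property, is involved during serialization):

    // eager reference: documents are resolved at load time instead of via a proxy
    @DBRef(lazy = false)
    public List<Song> songs;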

Question:

I want to find users by most recent date (Assume the User object has a date field). The data is stored in MongoDB and accessed via a Spring MongoTemplate.

Example of the raw data:

{userId:1, date:10}
{userId:1, date:20}
{userId:2, date:50}
{userId:2, date:10}
{userId:3, date:10}
{userId:3, date:30}

The query should return

 {{userId:1, date:20}, {userId:2, date:50}, {userId:3, date:30}}

The aggregation method I am using is:

db.table1.aggregate({$group:{'_id':'$userId', 'max':{$max:'$date'}}}, 
{$sort:{'max':1}}).result

Answer:

You could sort first by date descending and then take the first value while grouping by userId:

final Aggregation aggregation = newAggregation(
    Aggregation.sort(Sort.Direction.DESC, "date"),
    Aggregation.group("userId").first("date").as("Date")
);

final AggregationResults<User> results = mongoTemplate.aggregate(aggregation, "user", User.class);
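
As a hedged follow-up (assuming the User class maps the grouped _id and the projected "Date" field), the aggregated documents can then be read back from the result object:

    // one mapped User per distinct userId, carrying its most recent date
    final List<User> users = results.getMappedResults();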