Tuesday, 6 January 2015

Syncing different NoSQL data sources efficiently


I am dealing with two databases right now: Elasticsearch and MongoDB. MongoDB is legacy in this application; I am not happy about its presence, but it is there and in use. I would like to move completely to Elasticsearch, but that is probably not feasible at the moment for various non-technical reasons.


I have third-party product data indexed in Elasticsearch. The products were stored in MongoDB at some point but were moved to Elasticsearch because we needed full-text search and similar features.


These products can be included in user widgets, which are stored in MongoDB. The full product is copied into a data field in the widget, and the same product can appear in many different widgets. This was done for performance: the widgets are served on other sites, where there can be a number of widgets on a page, and we did not want to have to look up each widget and then each product in each widget.


So now I need to solve the problem of updating the content of the widgets when products are updated. I suppose I can track changes in Elasticsearch (that is currently the best place for me to detect them, though some other solution might be better).


Some things are more important to track than others: the URL of a product, its price, and how many are in stock are important, and if those change, the changes should cascade through the system.
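A minimal sketch of that distinction in Python, assuming each product is available as a plain dict and that the important fields live under the hypothetical keys `url`, `price` and `stock` (the real field names may well differ). It only reports a change when one of the tracked fields differs, so edits to less important fields would not trigger a cascade:

```python
# Hypothetical field names; the real product documents may use other keys.
TRACKED_FIELDS = ("url", "price", "stock")


def tracked_changes(old, new, fields=TRACKED_FIELDS):
    """Return {field: new_value} for each tracked field whose value differs."""
    return {
        field: new.get(field)
        for field in fields
        if old.get(field) != new.get(field)
    }


old = {"id": "p1", "url": "http://example.com/p1", "price": 10.0,
       "stock": 5, "description": "old text"}
new = {"id": "p1", "url": "http://example.com/p1", "price": 12.5,
       "stock": 5, "description": "new text"}

# The description changed too, but it is not tracked, so only the price
# shows up as something that would need to cascade to the widgets.
changes = tracked_changes(old, new)
```

An empty result means nothing important changed and the widgets can be left alone.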


So I'm hoping for suggestions on how to proceed. Things I have thought of:




  1. Have some third database that handles versioning. This theoretical system would have to efficiently change the product data inside widgets whenever the products it tracks change.




  2. Use the Elasticsearch changes plugin to detect changes to the data there and then sync.
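The first idea could be sketched roughly as follows. This is only an illustration under assumptions not stated above: each product carries an integer `version` that is bumped on every important change, each embedded copy records the version it was copied at, and widgets keep their products in a list under `data` with the product id under `id`. All of these names are hypothetical.

```python
def stale_widget_ids(widgets, current_versions):
    """Return ids of widgets holding at least one out-of-date product copy.

    `widgets` is an iterable of dicts shaped like
    {"_id": ..., "data": [{"id": ..., "version": ...}, ...]} (assumed shape).
    `current_versions` maps product id -> latest known version number.
    """
    stale = []
    for widget in widgets:
        for product in widget.get("data", []):
            latest = current_versions.get(product["id"])
            # Products without a known current version are skipped.
            if latest is not None and product.get("version", 0) < latest:
                stale.append(widget["_id"])
                break  # one stale product is enough to flag the widget
    return stale


widgets = [
    {"_id": "w1", "data": [{"id": "p1", "version": 3}, {"id": "p2", "version": 1}]},
    {"_id": "w2", "data": [{"id": "p2", "version": 2}]},
]
stale = stale_widget_ids(widgets, {"p1": 3, "p2": 2})
```

Here `w1` is flagged because its copy of `p2` is one version behind, while `w2` is already current.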




But the other problem is making the MongoDB update to the widgets efficient. Since embedding products in widgets was chosen for performance, the problem now becomes how to update them, and how to do so in a way that scales. (Scaling here assumes an average of probably 7-8 products per widget, generally not more than 12, but possibly thousands of products changing every day, with those products potentially spread across many widgets.)
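One possible shape for such an update, sketched in Python under assumptions: the widget's `data` field is an array of embedded product documents, each element stores its product id under `id`, and a given product appears at most once per widget. The helper names are hypothetical. The code only builds the filter and update documents as plain dicts, so it runs without a database:

```python
def widget_update_spec(product_id, changes, array_field="data", id_key="id"):
    """Build (filter, update) documents for one changed product.

    Assumes widgets embed products as an array under `array_field`, each
    element carrying its product id under `id_key` (both names assumed).
    """
    if not changes:
        raise ValueError("no changed fields to write")
    # Match every widget that embeds this product.
    filter_doc = {f"{array_field}.{id_key}": product_id}
    # The positional `$` operator targets the first array element matched by
    # the filter, hence the assumption of one copy per product per widget.
    update_doc = {
        "$set": {
            f"{array_field}.$.{field}": value
            for field, value in changes.items()
        }
    }
    return filter_doc, update_doc


def batch_specs(changed_products):
    """Turn {product_id: changes} into (filter, update) pairs, skipping no-ops."""
    return [
        widget_update_spec(pid, changes)
        for pid, changes in changed_products.items()
        if changes
    ]


specs = batch_specs({
    "p1": {"price": 12.5},
    "p2": {"stock": 0, "url": "http://example.com/p2"},
    "p3": {},  # nothing important changed, so no update is generated
})
```

With PyMongo, each pair could be wrapped as `pymongo.UpdateMany(filter_doc, update_doc)` and the whole list sent in a single `collection.bulk_write(ops, ordered=False)` call, so a day's worth of product changes goes out in a few batches rather than one round trip per widget. An index on `data.id` would let MongoDB find the affected widgets without scanning the collection.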


Both algorithm suggestions and more concrete suggestions on how to deal with this issue in MongoDB are welcome.




