Java - How do you access related entries co-located with data affinity in Hazelcast?
I am trying to use Hazelcast's map-reduce feature to perform an aggregate operation that needs to access co-located entries. Co-location is controlled using data affinity.
Imagine the classic customer/order model used in the Hazelcast documentation on data affinity. In my example I want to return a customer summary that has the customer and the sum of all their orders. For example, given this data set:
customer_id | name
------------------
1           | dave
2           | kate

order_id | customer_id | value
------------------------------
1        | 1           | 5
2        | 1           | 10
3        | 2           | 12
I want to return:
customer_id | name | value
--------------------------
1           | dave | 15
2           | kate | 12
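For reference, the expected result can be reproduced without Hazelcast in a few lines of plain Java. This is only a sketch of the aggregation itself: the class name CustomerSummaryExample and the {orderId, customerId, value} row layout are assumptions made for illustration, and nothing here addresses where the data lives in a cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CustomerSummaryExample {

    /** Sums order values per customer id; each row is {orderId, customerId, value}. */
    static Map<Integer, Integer> totalsByCustomer(int[][] orders) {
        Map<Integer, Integer> totals = new LinkedHashMap<>();
        for (int[] order : orders) {
            // merge() adds the order value to any running total for that customer
            totals.merge(order[1], order[2], Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Same sample data as in the tables of the question
        Map<Integer, String> customers = new LinkedHashMap<>();
        customers.put(1, "dave");
        customers.put(2, "kate");
        int[][] orders = { {1, 1, 5}, {2, 1, 10}, {3, 2, 12} };

        Map<Integer, Integer> totals = totalsByCustomer(orders);
        for (Map.Entry<Integer, String> customer : customers.entrySet()) {
            System.out.println(customer.getKey() + " | " + customer.getValue()
                    + " | " + totals.getOrDefault(customer.getKey(), 0));
        }
        // prints:
        // 1 | dave | 15
        // 2 | kate | 12
    }
}
```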
This is simple enough; however, the reason for using data affinity is to be able to perform the summing logic within the respective partition holding the data, by just getting all orders within that partition and therefore avoiding any cross-JVM communication.
And so to my question: from within a Mapper or similar, how do you get the co-located entries in the cache?
Edit:
After @noctarius' answer and comments, here's some code (I've tried to make it as brief as possible) that highlights the point at which I only want the orders from the current partition.
The order key class looks like this:

    public class OrderKey implements PartitionAware<CustomerIdentity> {
        ...
        @Override
        public CustomerIdentity getPartitionKey() {
            return this.customerIdentity;
        }
        ...
    }
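To make the effect of a partition key concrete, here is a minimal plain-Java sketch of why every order for a customer ends up in that customer's partition. It is not Hazelcast's implementation: partitionFor is a hypothetical stand-in that uses hashCode modulo the partition count (Hazelcast hashes the serialized key instead), and this simplified OrderKey uses a plain int customer id rather than the CustomerIdentity type above. The value 271 is Hazelcast's default partition count.

```java
import java.util.Objects;

public class AffinitySketch {

    static final int PARTITION_COUNT = 271; // Hazelcast's default partition count

    /** Simplified stand-in for partition assignment: hash of the partition key modulo the count. */
    static int partitionFor(Object partitionKey) {
        return Math.floorMod(Objects.hashCode(partitionKey), PARTITION_COUNT);
    }

    /** Hypothetical order key that is routed by its customer id, not by its own id. */
    static final class OrderKey {
        final int orderId;
        final int customerId;

        OrderKey(int orderId, int customerId) {
            this.orderId = orderId;
            this.customerId = customerId;
        }

        Object getPartitionKey() {
            return customerId; // autoboxed to Integer
        }
    }

    public static void main(String[] args) {
        int customerPartition = partitionFor(1); // partition of customer 1

        OrderKey first = new OrderKey(1, 1);
        OrderKey second = new OrderKey(2, 1);

        // Both orders are routed by customer id, so they share the customer's partition
        System.out.println(partitionFor(first.getPartitionKey()) == customerPartition);  // true
        System.out.println(partitionFor(second.getPartitionKey()) == customerPartition); // true
    }
}
```

Because the partition is derived only from the partition key, the order id plays no part in placement, which is what makes a partition-local sum possible in principle.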
And the mapper looks like this:

    public class OrderSumMapper
            implements Mapper<CustomerKey, Customer, CustomerKey, CustomerOrderTotal>, HazelcastInstanceAware {
        ...
        @Override
        public void map(CustomerKey customerKey, Customer customer,
                        Context<CustomerKey, CustomerOrderTotal> context) {
            Predicate ordersForCustomer = new OrdersForCustomerPredicate(customerKey);
            int totalValue = 0;
            //******************************************************************
            //
            // Given that the orders are co-located with the customer, how do I
            // ensure that this call to get the orders only runs in the current
            // partition?
            //
            //******************************************************************
            for (Order order : hazelcastInstance.getMap("orders").values(ordersForCustomer)) {
                totalValue += order.getValue();
            }
            context.emit(customerKey, new CustomerOrderTotal(customer, totalValue));
        }
        ...
    }
The highlighted call, hazelcastInstance.getMap("orders").values(ordersForCustomer), would ordinarily hit all nodes in the cluster, but because the data is co-located this is unnecessary overhead.
And that brings me back to my original question: how do I get the orders such that only those in the current partition are returned?
You can inject the current node's HazelcastInstance into the mapper and retrieve the second data structure to read the data.
See a basic example here: https://github.com/noctarius/hazelcast-mapreduce-presentation/blob/master/src/main/java/com/hazelcast/examples/tutorials/impl/salarymapper.java