Java 8 stream map-and-group operation

Problem description:

I have the following two classes:

Person

public class Person { 

    private final Long id; 
    private final String address; 
    private final String phone; 

    public Person(Long id, String address, String phone) { 
     this.id = id; 
     this.address = address; 
     this.phone = phone; 
    } 

    public Long getId() { 
     return id; 
    } 

    public String getAddress() { 
     return address; 
    } 

    public String getPhone() { 
     return phone; 
    } 

    @Override 
    public String toString() { 
     return "Person [id=" + id + ", address=" + address + ", phone=" + phone + "]"; 
    } 
} 

CollectivePerson

import java.util.HashSet; 
import java.util.Set; 

public class CollectivePerson { 

    private final Long id; 
    private final Set<String> addresses; 
    private final Set<String> phones; 

    public CollectivePerson(Long id) { 
     this.id = id; 
     this.addresses = new HashSet<>(); 
     this.phones = new HashSet<>(); 
    } 

    public Long getId() { 
     return id; 
    } 

    public Set<String> getAddresses() { 
     return addresses; 
    } 

    public Set<String> getPhones() { 
     return phones; 
    } 

    @Override 
    public String toString() { 
     return "CollectivePerson [id=" + id + ", addresses=" + addresses + ", phones=" + phones + "]"; 
    } 
} 

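I would like a stream operation such that:

  • each Person is mapped to a CollectivePerson
  • the address and phone of all Persons with the same id are merged into the addresses and phones of the corresponding CollectivePerson, respectively

I wrote the following code snippet for this purpose: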

import java.util.ArrayList; 
import java.util.HashMap; 
import java.util.List; 
import java.util.Map; 
import java.util.Objects; 
import java.util.stream.Collectors; 

public class Main { 

    public static void main(String[] args) { 
     Person person1 = new Person(1L, "Address 1", "Phone 1"); 
     Person person2 = new Person(2L, "Address 2", "Phone 2"); 
     Person person3 = new Person(3L, "Address 3", "Phone 3"); 
     Person person11 = new Person(1L, "Address 4", "Phone 4"); 
     Person person21 = new Person(2L, "Address 5", "Phone 5"); 
     Person person22 = new Person(2L, "Address 6", "Phone 6"); 

     List<Person> persons = new ArrayList<>(); 
     persons.add(person1); 
     persons.add(person11); 
     persons.add(person2); 
     persons.add(person21); 
     persons.add(person22); 
     persons.add(person3); 

     Map<Long, CollectivePerson> map = new HashMap<>(); 
     List<CollectivePerson> collectivePersons = persons.stream() 
       .map((Person person) -> { 
        CollectivePerson collectivePerson = map.get(person.getId()); 

        if (Objects.isNull(collectivePerson)) { 
         collectivePerson = new CollectivePerson(person.getId()); 
         map.put(person.getId(), collectivePerson); 

         collectivePerson.getAddresses().add(person.getAddress()); 
         collectivePerson.getPhones().add(person.getPhone()); 

         return collectivePerson; 
        } else { 
         collectivePerson.getAddresses().add(person.getAddress()); 
         collectivePerson.getPhones().add(person.getPhone()); 

         return null; 
        } 
       }) 
       .filter(Objects::nonNull) 
       .collect(Collectors.<CollectivePerson>toList()); 

     collectivePersons.forEach(System.out::println); 
    } 
} 

It does the job and prints:

CollectivePerson [id=1, addresses=[Address 1, Address 4], phones=[Phone 1, Phone 4]] 
CollectivePerson [id=2, addresses=[Address 2, Address 6, Address 5], phones=[Phone 5, Phone 2, Phone 6]] 
CollectivePerson [id=3, addresses=[Address 3], phones=[Phone 3]] 

But I believe there could be a better, more stream-like way of grouping to achieve the same result. Any pointers would be great.

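Instead of manipulating an external Map, you should use a collector. There are toMap and groupingBy, both of which can solve the problem, though the result is a bit verbose because of your class design. The main obstacle is the lack of an existing method for merging a Person into a CollectivePerson, for constructing a CollectivePerson from a given Person instance, or for merging two CollectivePerson instances.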

One way to do it using built-in collectors would be

List<CollectivePerson> collectivePersons = persons.stream() 
    .map(p -> { 
     CollectivePerson cp = new CollectivePerson(p.getId()); 
     cp.getAddresses().add(p.getAddress()); 
     cp.getPhones().add(p.getPhone()); 
     return cp; 
    }) 
    .collect(Collectors.collectingAndThen(Collectors.toMap(
     CollectivePerson::getId, Function.identity(), 
     (cp1, cp2) -> { 
      cp1.getAddresses().addAll(cp2.getAddresses()); 
      cp1.getPhones().addAll(cp2.getPhones()); 
      return cp1; 
     }), 
     m -> new ArrayList<>(m.values()) 
    )); 

but in this case, a custom collector might be simpler:

Collection<CollectivePerson> collectivePersons = persons.stream() 
    .collect(
     HashMap<Long,CollectivePerson>::new, 
     (m,p) -> { 
      CollectivePerson cp=m.computeIfAbsent(p.getId(), CollectivePerson::new); 
      cp.getAddresses().add(p.getAddress()); 
      cp.getPhones().add(p.getPhone()); 
     }, 
     (m1,m2) -> m2.forEach((l,cp) -> m1.merge(l, cp, (cp1,cp2) -> { 
      cp1.getAddresses().addAll(cp2.getAddresses()); 
      cp1.getPhones().addAll(cp2.getPhones()); 
      return cp1; 
     }))).values(); 

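Both would benefit from a predefined method for merging two CollectivePerson instances. The first variant would also benefit from a CollectivePerson(Long id, Set<String> addresses, Set<String> phones) constructor or, even better, a CollectivePerson(Person p) constructor, while the second would benefit from a CollectivePerson.add(Person p) method...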

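Note that the second variant returns a Collection view of the Map's values without copying. If you really need a List, you can get one simply by using new ArrayList<>(«map».values()), as the first variant does in its finisher function.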
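The helper methods suggested above could look roughly like the following. This is only a sketch under stated assumptions: the CollectivePerson(Person) constructor and the add and merge methods are hypothetical additions that do not exist in the question's classes, and HelperDemo, viaToMap and viaCollect are names invented for illustration. With such helpers in place, both variants shrink to a few method references.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Trimmed copy of the question's Person class.
class Person {
    private final Long id;
    private final String address;
    private final String phone;

    Person(Long id, String address, String phone) {
        this.id = id;
        this.address = address;
        this.phone = phone;
    }

    Long getId() { return id; }
    String getAddress() { return address; }
    String getPhone() { return phone; }
}

// The question's CollectivePerson plus three HYPOTHETICAL helpers.
final class CollectivePerson {
    private final Long id;
    private final Set<String> addresses = new HashSet<>();
    private final Set<String> phones = new HashSet<>();

    CollectivePerson(Long id) { this.id = id; }

    // Hypothetical: build a CollectivePerson from a single Person.
    CollectivePerson(Person p) {
        this(p.getId());
        add(p);
    }

    // Hypothetical: fold one Person into this instance.
    CollectivePerson add(Person p) {
        addresses.add(p.getAddress());
        phones.add(p.getPhone());
        return this;
    }

    // Hypothetical: absorb another instance that has the same id.
    CollectivePerson merge(CollectivePerson other) {
        addresses.addAll(other.addresses);
        phones.addAll(other.phones);
        return this;
    }

    Long getId() { return id; }
    Set<String> getAddresses() { return addresses; }
    Set<String> getPhones() { return phones; }
}

class HelperDemo {
    // First variant: toMap with a merge function, then copy the values into a List.
    static List<CollectivePerson> viaToMap(List<Person> persons) {
        return persons.stream()
            .map(CollectivePerson::new)
            .collect(Collectors.collectingAndThen(
                Collectors.toMap(CollectivePerson::getId, Function.identity(), CollectivePerson::merge),
                m -> new ArrayList<>(m.values())));
    }

    // Second variant: three-argument collect; returns the map's values view without copying.
    static Collection<CollectivePerson> viaCollect(List<Person> persons) {
        return persons.stream()
            .collect(
                HashMap<Long, CollectivePerson>::new,
                (m, p) -> m.computeIfAbsent(p.getId(), CollectivePerson::new).add(p),
                (m1, m2) -> m2.forEach((id, cp) -> m1.merge(id, cp, CollectivePerson::merge)))
            .values();
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person(1L, "Address 1", "Phone 1"),
            new Person(1L, "Address 4", "Phone 4"),
            new Person(2L, "Address 2", "Phone 2"));

        for (CollectivePerson cp : viaToMap(persons)) {
            System.out.println(cp.getId() + " " + cp.getAddresses() + " " + cp.getPhones());
        }

        // Copy the values view only when an independent List is really needed.
        List<CollectivePerson> copy = new ArrayList<>(viaCollect(persons));
        System.out.println(copy.size());
    }
}
```

Whether the helpers belong on CollectivePerson itself or in a separate utility class is a design choice; putting them on the class keeps the stream pipeline free of multi-line lambdas.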

Thank you, I went with your second variant and it is very fast. It processes ~1000000 'Person' objects in about 10 seconds.

You can use Collectors.toMap with a merge function:

public static <T, K, U, M extends Map<K, U>> 
Collector<T, ?, M> toMap(Function<? super T, ? extends K> keyMapper, 
          Function<? super T, ? extends U> valueMapper, 
          BinaryOperator<U> mergeFunction, 
          Supplier<M> mapSupplier) 

The mapping looks like this:

Map<Long,CollectivePerson> collectivePersons = 
    persons.stream() 
     .collect(Collectors.toMap (Person::getId, 
            p -> { 
             CollectivePerson cp = new CollectivePerson (p.getId()); 
             cp.getAddresses().add (p.getAddress()); 
             cp.getPhones().add(p.getPhone()); 
             return cp; 
            }, 
            (cp1,cp2) -> { 
             cp1.getAddresses().addAll(cp2.getAddresses()); 
             cp1.getPhones().addAll(cp2.getPhones()); 
             return cp1; 
            }, 
            HashMap::new)); 

You can easily extract a List<CollectivePerson> from the Map with

new ArrayList<>(collectivePersons.values()) 

This is the output Map for your sample input:

{1=CollectivePerson [id=1, addresses=[Address 1, Address 4], phones=[Phone 1, Phone 4]], 
2=CollectivePerson [id=2, addresses=[Address 2, Address 6, Address 5], phones=[Phone 5, Phone 2, Phone 6]], 
3=CollectivePerson [id=3, addresses=[Address 3], phones=[Phone 3]]} 

There is no need to specify 'HashMap::new'; the task does not require the map to be an instance of 'HashMap'... – Holger

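@Holger You are right. For some reason I thought that the only variant of 'toMap' that takes a merge function also requires a supplier. That said, I just noticed that the implementation of the three-argument 'toMap' (i.e. the one without a supplier) is 'return toMap(keyMapper, valueMapper, mergeFunction, HashMap::new);' :) – Eran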

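Yes, in the current implementation it always produces a 'HashMap', just like 'toList()' always produces an 'ArrayList'. However, this is not guaranteed, and if you have no requirement to get exact instances of these types, you should leave the implementation free to change whenever a change promises a benefit. On the flip side, there is no method that takes a map supplier without also requiring a merge function. – Holger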
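The point made in the comments above can be illustrated with a small sketch. It is deliberately simplified and rests on assumptions: rows are plain Map.Entry<Long, String> pairs of id and address instead of Person objects, and ToMapDemo, row, setOf, union, anyMap and sortedMap are names made up for this example. The three-argument toMap is used when any Map will do; the four-argument overload is used only when a specific map type, here a TreeMap for sorted keys, is actually required.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

class ToMapDemo {
    // Hypothetical stand-in for a Person: just an (id, address) pair.
    static Map.Entry<Long, String> row(long id, String address) {
        return new SimpleEntry<Long, String>(id, address);
    }

    // Wraps a single value in a fresh mutable set so the merge function can grow it.
    static Set<String> setOf(String value) {
        Set<String> set = new HashSet<>();
        set.add(value);
        return set;
    }

    // Merge function shared by both variants: folds the second set into the first.
    static Set<String> union(Set<String> s1, Set<String> s2) {
        s1.addAll(s2);
        return s1;
    }

    // Three-argument toMap: the caller relies only on the Map interface.
    static Map<Long, Set<String>> anyMap(List<Map.Entry<Long, String>> rows) {
        return rows.stream().collect(Collectors.toMap(
            Map.Entry::getKey,
            e -> setOf(e.getValue()),
            ToMapDemo::union));
    }

    // Four-argument toMap: the supplier is spelled out because sorted keys are required.
    static TreeMap<Long, Set<String>> sortedMap(List<Map.Entry<Long, String>> rows) {
        return rows.stream().collect(Collectors.toMap(
            Map.Entry::getKey,
            e -> setOf(e.getValue()),
            ToMapDemo::union,
            TreeMap::new));
    }

    public static void main(String[] args) {
        List<Map.Entry<Long, String>> rows = Arrays.asList(
            row(3L, "Address 3"),
            row(1L, "Address 1"),
            row(2L, "Address 2"),
            row(1L, "Address 4"));

        // Iteration order of the result is unspecified; only the Map contract is relied on.
        System.out.println(anyMap(rows).get(1L).size()); // prints 2

        // A TreeMap iterates its keys in ascending order.
        System.out.println(sortedMap(rows).keySet()); // prints [1, 2, 3]
    }
}
```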

Use the groupingBy collector to group your persons!

List<CollectivePerson> list = persons.stream()
    .collect(Collectors.groupingBy(Person::getId))
    .entrySet().stream()
    .map(x -> {
        // collect all the addresses from the list of persons sharing the same id
        Set<String> addresses = x.getValue().stream().map(Person::getAddress).collect(Collectors.toSet());
        // collect all the phones from the list of persons sharing the same id
        Set<String> phones = x.getValue().stream().map(Person::getPhone).collect(Collectors.toSet());
        // requires a constructor that takes three parameters
        return new CollectivePerson(x.getKey(), addresses, phones);
    })
    .collect(Collectors.toList());

For this to work, you need to add this constructor:

public CollectivePerson(Long id, Set<String> addresses, Set<String> phones) { 
    this.id = id; 
    this.addresses = addresses; 
    this.phones = phones; 
} 

Map<Long, CollectivePerson> map = persons.stream()
    .collect(Collectors.groupingBy(Person::getId,
        Collectors.collectingAndThen(Collectors.toList(),
            Main::downColl)));

This uses a method reference to a method that creates a CollectivePerson object from a list of persons with the same id.

public static CollectivePerson downColl(List<Person> ps) { 

    CollectivePerson cp = new CollectivePerson(ps.get(0).getId());   
    for (Person p:ps) { 
     cp.getAddresses().add(p.getAddress()); 
     cp.getPhones().add(p.getPhone()); 
    } 
    return cp; 
}