Just a quick riff/hack on whether it’d be hard to make a collect() method that “collected” into a Vec without needing any turbofish (see, if you’re interested, my prior post on the turbofish).
Some grasp of traits and iteration is required to comfortably get this … though it might be a fun dive even if you’re not.
Background on collect
The implementation of collect is:
fn collect<B: FromIterator<Self::Item>>(self) -> B
where
    Self: Sized,
{
    FromIterator::from_iter(self)
}
The generic type B is bound by FromIterator, which basically enables a type to be constructed from an Iterator. In other words, collect() returns any type that can be built from an iterator. EG, Vec.
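To make the FromIterator bound a bit more concrete, here is a minimal sketch of a type that opts in to being built by collect(). The Stats struct and the collect_stats() helper are made up for illustration and are not from the standard library. The explicit import of FromIterator is only there so the sketch also compiles on editions older than 2021.

```rust
// In the prelude since the 2021 edition; imported explicitly for older editions.
use std::iter::FromIterator;

// Hypothetical type, only for illustration: a count and a running total.
#[derive(Debug, PartialEq)]
struct Stats {
    count: usize,
    sum: i32,
}

// Implementing FromIterator<i32> is what allows collect() to build a Stats.
impl FromIterator<i32> for Stats {
    fn from_iter<T: IntoIterator<Item = i32>>(iter: T) -> Self {
        let mut stats = Stats { count: 0, sum: 0 };
        for x in iter {
            stats.count += 1;
            stats.sum += x;
        }
        stats
    }
}

// Hypothetical helper: the return type pins down B, so no turbofish is needed.
fn collect_stats() -> Stats {
    vec![1, 2, 3].into_iter().collect()
}

fn main() {
    // prints: Stats { count: 3, sum: 6 }
    println!("{:?}", collect_stats());
}
```

Nothing about collect() itself changed here. It simply hands the iterator over to whichever from_iter the target type provides.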
The reason the turbofish comes about is that, as I said above, collect() returns “any type” that can be built from an iterator. So when we run something like:
let z = [1i32, 2, 3].into_iter().collect();
… we have a problem … rust, or the collect() method, has no idea what type we’re building/constructing.
More specifically, looking at the code for collect, in the call of FromIterator::from_iter(self), which is calling the method on the trait directly, rust has no way to determine which implementation of the trait to use. The one on Vec or HashMap or String etc??
Thus, the turbofish syntax specifies the generic type B. Since B is also the return type of the FromIterator::from_iter(self) call, type inference can then determine which implementation to use.
let z = [1i32, 2, 3].into_iter().collect::<Vec<_>>();
IE: Use the implementation on Vec!
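As a rough sketch of what “which implementation” means, the same kind of iterator can be sent to different FromIterator implementations just by changing B. The helper function names below are made up for illustration. The fully qualified call in as_vec_explicit() spells out the implementation choice that the turbofish makes for us.

```rust
use std::collections::BTreeSet;
// In the prelude since the 2021 edition; imported explicitly for older editions.
use std::iter::FromIterator;

// B = Vec<i32>, chosen with the turbofish.
fn as_vec() -> Vec<i32> {
    vec![3, 1, 3].into_iter().collect::<Vec<_>>()
}

// The same choice written out in full:
// "use Vec<i32>'s implementation of FromIterator<i32>".
fn as_vec_explicit() -> Vec<i32> {
    <Vec<i32> as FromIterator<i32>>::from_iter(vec![3, 1, 3])
}

// B = BTreeSet<i32>: duplicates are dropped and the items come out sorted.
fn as_set() -> BTreeSet<i32> {
    vec![3, 1, 3].into_iter().collect::<BTreeSet<_>>()
}

// B = String, built from an iterator of chars.
fn as_string() -> String {
    vec!['a', 'b'].into_iter().collect::<String>()
}

fn main() {
    println!("{:?}", as_vec()); // prints: [3, 1, 3]
    println!("{:?}", as_vec_explicit()); // prints: [3, 1, 3]
    println!("{:?}", as_set()); // prints: {1, 3}
    println!("{:?}", as_string()); // prints: "ab"
}
```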
Why not just use Vec?
I figure Vec is used so often as the type for collecting an Iterator that it could be nice to have a convenient method.
The docs even hint at this by suggesting that calling the FromIterator::from_iter() method directly from the desired type (eg Vec) can be more readable (see FromIterator docs).
EG … using collect:
let d = [1i32, 2, 3];
let x = d.iter().map(|x| x + 100).collect::<Vec<_>>();
Using Vec::from_iter():
let y = Vec::from_iter(d.iter().map(|x| x + 100));
As Vec is always in the prelude (IE, it’s always available), and FromIterator has been too since the 2021 edition, using from_iter clearly seems like a nicer option here.
But you lose method chaining! So … how about a method on Iterator, like collect but for Vec specifically? How would you make that and is it hard??
Making collect_vec()
It’s not hard actually:
- Define a trait, CollectVec, that defines a method collect_vec which returns Vec<Self::Item>
- Make this a “sub-trait” of Iterator (or, make Iterator the “supertrait”) so that the Iterator::collect() method is always available
- Implement CollectVec for all types that implement Iterator by just calling self.collect() … the type inference will take care of the rest, because it’s clear that a Vec will be used.
trait CollectVec: Iterator {
    fn collect_vec(self) -> Vec<Self::Item>;
}
impl<I: Iterator> CollectVec for I {
    fn collect_vec(self) -> Vec<Self::Item> {
        self.collect()
    }
}
With this you can then do the following:
let d = [1i32, 2, 3];
let d2 = d.iter().map(|x| x + 1).collect_vec();
Don’t know about you, but implementing such methods for the common collection types would suit me just fine … that turbofish is a pain to write … and AFAICT this isn’t inconsistent with rust’s style/design. And it’s super easy to implement … the type system handles this issue very well.
The idea & execution are great, I just don’t know that I would ever do this for collecting into Vecs myself. When I’m tired of writing turbofish, I usually just annotate the type for the binding of the “result”:
let d = [1i32, 2, 3];
let y: Vec<_> = d.iter().map(|x| x + 100).collect();
So often have I collected into a vector then later realized that I really wanted a map or set instead, that I prefer keeping the code “flat” and duplicated (i.e. we don’t “go into” a specific function) so that I can just swap out the Vec for a HashMap or BTreeSet when & where the need arises.
> So often have I collected into a vector then later realized that I really wanted a map or set instead, that I prefer keeping the code “flat” and duplicated (i.e. we don’t “go into” a specific function) so that I can just swap out …
Yea good point. And yea, annotating the binding rather than using the turbofish seems more natural to me too.
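A small sketch of the swap being discussed, assuming the same kind of map as in the earlier examples. The function names are made up for illustration. Going from a Vec to a BTreeSet only changes the annotation on the binding, while going to a HashMap also needs the closure to produce (key, value) pairs.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical helper: collect into a Vec via the binding's annotation.
fn shifted_vec(d: &[i32]) -> Vec<i32> {
    let y: Vec<_> = d.iter().map(|x| x + 100).collect();
    y
}

// Same chain; only the annotation on the binding changed.
fn shifted_set(d: &[i32]) -> BTreeSet<i32> {
    let y: BTreeSet<_> = d.iter().map(|x| x + 100).collect();
    y
}

// A map is built from (key, value) pairs, so the closure changes slightly too.
fn shifted_map(d: &[i32]) -> HashMap<i32, i32> {
    let y: HashMap<_, _> = d.iter().map(|x| (*x, x + 100)).collect();
    y
}

fn main() {
    let d = [1i32, 2, 2];
    println!("{:?}", shifted_vec(&d)); // prints: [101, 102, 102]
    println!("{:?}", shifted_set(&d)); // prints: {101, 102}
    println!("{:?}", shifted_map(&d).get(&2)); // prints: Some(102)
}
```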
In the end though, my motivation here was to see if I could, not whether I should! 😜
Though to be fair, I can see myself adding collect_vec() to a codebase if I know I will be collecting into a bunch of vecs. Just because it’s my little monster and I’m biased! And adding other methods for the other common collections probably wouldn’t be too hard??!!
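For what it’s worth, a sketch of how methods for other collections might look, following the same pattern as CollectVec. The trait name CollectSets and both method names are made up for illustration. The main difference is that sets need extra bounds on the item type, here placed as where clauses on default methods so the blanket impl can stay empty.

```rust
use std::collections::{BTreeSet, HashSet};
use std::hash::Hash;

// Hypothetical companion to CollectVec; the names are only for illustration.
trait CollectSets: Iterator + Sized {
    // HashSet can only be built from items that are Eq + Hash.
    fn collect_hash_set(self) -> HashSet<Self::Item>
    where
        Self::Item: Eq + Hash,
    {
        self.collect()
    }

    // BTreeSet can only be built from items that are Ord.
    fn collect_btree_set(self) -> BTreeSet<Self::Item>
    where
        Self::Item: Ord,
    {
        self.collect()
    }
}

// Blanket implementation: every iterator gets the default methods above.
impl<I: Iterator> CollectSets for I {}

fn main() {
    let d = [3i32, 1, 3];
    let hs = d.iter().map(|x| x + 1).collect_hash_set();
    let bs = d.iter().map(|x| x + 1).collect_btree_set();
    println!("{}", hs.len()); // prints: 2
    println!("{:?}", bs); // prints: {2, 4}
}
```

As with collect_vec(), the return type of each method is all the compiler needs to pick the right FromIterator implementation.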