I assume your goal is to translate this idiom to Datasets:
rdd.map(x => (x.someKey, x.someField))
.reduceByKey(_ + _)
// => returning an RDD of (KeyType, FieldType)
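For reference, the semantics of that RDD idiom can be reproduced with plain Scala collections and no Spark at all. `Record` and its fields are hypothetical stand-ins for the `x` in the example; this is only a sketch of what `reduceByKey(_ + _)` does per key:

```scala
// Hypothetical record type standing in for x in the RDD example.
case class Record(someKey: String, someField: Int)

// Plain-collections analogue of rdd.map(...).reduceByKey(_ + _):
// pair up (key, field), then merge the values per key with +.
def reduceByKey(pairs: Seq[(String, Int)]): Map[String, Int] =
  pairs.foldLeft(Map.empty[String, Int]) { case (acc, (k, v)) =>
    acc.updated(k, acc.getOrElse(k, 0) + v)
  }

val rddLike = Seq(Record("a", 1), Record("b", 2), Record("a", 3))
val result = reduceByKey(rddLike.map(x => (x.someKey, x.someField)))
// result == Map("a" -> 4, "b" -> 2)
```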
The closest solution I have found with the Dataset API looks like this:
ds.map(x => (x.someKey, x.someField))          // [1]
  .groupByKey(_._1)
  .reduceGroups((a, b) => (a._1, a._2 + b._2))
  .map(_._2)                                   // [2]
// => returning a Dataset of (KeyType, FieldType)
// [1] The map before groupByKey is required to end up with the proper
// type in reduceGroups: we want to reduce over the FieldType, not the
// original record type.
// [2] Required since reduceGroups hands back a Dataset of (key, reduced
// value) pairs, not knowing that our values are already key-value pairs;
// map(_._2) drops the redundant outer key.
This doesn't look very elegant, and according to a quick benchmark it is also considerably less performant, so maybe we are missing something here...
Note: An alternative might be to use groupByKey(_.someKey) as a first step. But the main problem here is that groupByKey changes the type from a regular Dataset to a KeyValueGroupedDataset. The latter doesn't carry a regular map function. Instead it offers mapGroups, which does not seem very convenient because it wraps the values in an Iterator and, according to its docstring, performs a shuffle.
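To make that alternative concrete without a Spark cluster, here is a plain-Scala sketch of what mapGroups-style code has to do: receive each key's values as an Iterator and reduce them by hand. Record, someKey and someField are hypothetical stand-ins; Spark itself is not used:

```scala
// Hypothetical record type for the illustration.
case class Record(someKey: String, someField: Int)

// Plain-collections analogue of groupByKey(_.someKey).mapGroups(...):
// group by the key, then fold each group's Iterator into a sum.
def sumPerKey(records: Seq[Record]): Map[String, Int] =
  records
    .groupBy(_.someKey)                          // like groupByKey(_.someKey)
    .map { case (k, group) =>                    // like mapGroups((k, it) => ...)
      val it: Iterator[Record] = group.iterator  // values arrive as an Iterator
      k -> it.map(_.someField).sum
    }

val data = Seq(Record("a", 1), Record("a", 2), Record("b", 3))
val result = sumPerKey(data)
// result == Map("a" -> 3, "b" -> 3)
```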