It is not possible to yield multiple RDDs from a single transformation*. If you want to split an RDD, you have to apply a filter for each split condition. For example:
def even(x): return x % 2 == 0
def odd(x): return not even(x)
rdd = sc.parallelize(range(20))
rdd_odd, rdd_even = (rdd.filter(f) for f in (odd, even))
If you have only a binary condition and the predicate is expensive to compute, you may prefer something like this:
kv_rdd = rdd.map(lambda x: (x, odd(x)))
rdd_odd = kv_rdd.filter(lambda kv: kv[1]).keys()  # kv is (value, is_odd); test the flag, not the tuple
rdd_even = kv_rdd.filter(lambda kv: not kv[1]).keys()
This evaluates the predicate only once per element but requires an additional pass over all the data.
Note that as long as the input RDD is properly cached and there are no additional assumptions about data distribution, there is no significant difference in time complexity between repeated filter calls and a single for-loop with nested if-else.
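To illustrate the complexity point without a Spark cluster, here is a minimal plain-Python sketch (ordinary lists stand in for a cached RDD, which is an assumption made only for illustration). It compares one filter pass per condition with a single loop using if-else. Both produce the same splits, and both do work linear in the size of the input; the repeated filter only changes the constant factor.

```python
def even(x):
    return x % 2 == 0

def odd(x):
    return not even(x)

# A plain list standing in for a cached input RDD (illustrative assumption).
data = list(range(20))

# Approach 1: one filter pass per split condition (two passes over the data).
odd_by_filter = [x for x in data if odd(x)]
even_by_filter = [x for x in data if even(x)]

# Approach 2: a single loop with if-else (one pass over the data).
odd_by_loop, even_by_loop = [], []
for x in data:
    if odd(x):
        odd_by_loop.append(x)
    else:
        even_by_loop.append(x)

# Both approaches yield identical splits.
assert odd_by_filter == odd_by_loop
assert even_by_filter == even_by_loop
```

In Spark the same reasoning applies only if the input is cached; otherwise each filter recomputes the input's lineage from scratch.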