On Fri, Apr 01, 2016 at 02:47:05PM +0100, Wilco Dijkstra wrote:
Evandro Menezes wrote:
Ping^1
I haven't seen a newer version that incorporates my feedback. To recap, what
I'd like to see is a more general way to select approximations based on mode.
I don't believe that looking at the inner mode works in general, and it
doesn't make sense to add internal tune flags for all possible combinations.
Agreed. I don't think that a flag for each element of the Cartesian product
of {rsqrt,sqrt,div} x {SF,DF,V2SF,V4SF,V2DF} is a scalable solution - that's
at least 15 flags we'll need.
As I said earlier in the discussion, this particular split (between SF and
DF mode) seems strange to me. I'd expect that V4SF vs. SF would also be
interesting, and that a distinction between vector modes and scalar
modes would be more likely to be useful.
To give an idea of what I mean, it would be easiest to add a single field to
the CPU tuning structure that contains a mask for all the combinations. Then
we call a single function with the approximation kind (i.e. sqrt, rsqrt,
div (x/y), recip (1/x)) and the mode, which uses the CPU tuning field to
decide whether the approximation should be inlined.
I like the idea of a single cost function.
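To make sure we mean the same thing, here is a rough sketch of how I read
the mask-plus-single-function idea. Every name in it (approx_kind,
approx_mode, APPROX_BIT, cpu_tune, use_approx_p) is made up for illustration
and is not taken from any posted patch or from the existing tuning
structures; a real version would presumably key off machine_mode rather than
a private enum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these exist in GCC.  */
enum approx_kind
{
  APPROX_SQRT,
  APPROX_RSQRT,
  APPROX_DIV,    /* x/y */
  APPROX_RECIP,  /* 1/x */
  APPROX_NUM_KINDS
};

enum approx_mode
{
  MODE_SF,
  MODE_DF,
  MODE_V2SF,
  MODE_V4SF,
  MODE_V2DF,
  APPROX_NUM_MODES
};

/* One bit per (kind, mode) pair: 4 kinds * 5 modes = 20 bits.  */
#define APPROX_BIT(kind, mode) \
  ((uint32_t) 1 << ((kind) * APPROX_NUM_MODES + (mode)))

/* Stand-in for the per-CPU tuning structure, reduced to the one field
   under discussion.  */
struct cpu_tune
{
  uint32_t approx_mask;
};

/* Single query point: return true if the tuning for this CPU asks for the
   approximation of KIND to be expanded inline in MODE.  */
static bool
use_approx_p (const struct cpu_tune *tune, enum approx_kind kind,
              enum approx_mode mode)
{
  assert (kind < APPROX_NUM_KINDS && mode < APPROX_NUM_MODES);
  return (tune->approx_mask & APPROX_BIT (kind, mode)) != 0;
}
```

With something along these lines, a core that only wants the rsqrt
approximation for single precision would set approx_mask to
APPROX_BIT (APPROX_RSQRT, MODE_SF) | APPROX_BIT (APPROX_RSQRT, MODE_V4SF),
and the expanders would ask use_approx_p (tune, kind, mode) instead of
testing individual tune flags.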