This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On Mon, Jan 29, 2007 at 09:57:29AM -0800, Stuart Hastings wrote: > Last ping is here: > > http://gcc.gnu.org/ml/gcc-patches/2007-01/msg00299.html > > Patch awaiting review is here: > > http://gcc.gnu.org/ml/gcc-patches/2006-12/msg01541.html I had a look at this, and there are a number of things wrong with it. Rather than iterate over this lots of times, I fixed them up; the result is attached. In summary: I'm not fond of TARGET_KEEPS_VECTOR_ALIGNED_STACK preventing these, but it's a reasonable hack for now. It'd be nice if we could prove these vector values never got spilled; if the user is using SSE on their own, then we can piggy-back on that and use it too. Or, as I think I saw Jan mention somewhere in the thread, expand these as post-reload splitters. Which doesn't seem like it'd be *that* hard to do cleanly. Failing that, we could arrange for reload to emit movups, et al, when spilling. All of which is not really relevant for Darwin, so we'll ignore it for now. I added a comment about the HImode conversions. You added a "floatunsdidf3" expander, which would be ignored. It must be merged with the 64-bit "floatunsdidf2" expander above. The movdi_to_sse pattern used which_alternative, but only had one alternative. It was clearly intended to have a second alternative for load from memory, but got it wrong. Ideally, this would be done in the regular movdi pattern that allows inter-unit moves, but we don't actually have one of those. I think they were disabled because it made reload do Truely Horrible Things with register class choices. We might could revisit that at some point... I'd rather tackle x86_inter_unit_moves in a separate patch. All uses of ix86_fp_from_int_const used ULL suffixes, which is not valid for C90, which we're supposed to support. I was able to replace all instances with real_ldexp. I was able to merge the SF and DF to SI conversion routines. I think it's almost possible to write this without using actual vector modes at all, except for the actual cvttpd2dq insn. We need to ensure that the high fp values are valid (prefereably zero). That's easy to do with vector insns coming up to it, so it seems silly to hack in a "xorps xmm0, xmm0; movss xmm1, xmm0" before the actual conversion. Again, something that wouldn't be relevant if we expanded this post-reload. Calling gen_rtx_SUBREG directly is generally bad practice. In all of these instances we wanted gen_lowpart or gen_highpart. I did do away with the clamping in the fp to uint conversions; the C standard is quite clear that out-of-range values can produce any result. I'm not sure the purpose of store_xmm_as_DF. In what cases does ix86_expand_vector_extract do the wrong thing? During initial rtl expansion, it's almost always a mistake to select between MEM and REG. Many REGs will have been loaded from REGs, and combine could be convinced to put it back if you give it half a chance. I *think* my changes to ix86_expand_convert_uns_didf_sse in this regard are correct, but I havn't done exhaustive testing. ix86_expand_convert_uns_sidf_sse didn't need any vector ops. The unsigned sidf and signed didf operations would be interesting to expand after reload, Just In Case the value happened to be in SSE registers or memory to begin with. For instance, with this patch we currently generate cvtsi2sd 20(%esp), %xmm1 mulsd .LC7, %xmm1 { 0x1.0p32 } movl 16(%esp), %eax subl $-2147483648, %eax cvtsi2sd %eax, %xmm0 addsd .LC3, %xmm0 { 0x1.0p31 } addsd %xmm0, %xmm1 Add two more insns if we disallow inter-unit moves. Best case with SSE3 might look like movq 16(%esp), %xmm0 psubd LC0, %xmm0 { INT_MIN, 0, 0, 0 } cvtpi2pd %xmm0, %xmm0 addsd LC1, %xmm0 { 0x1.0p31 } mulpd LC2, %xmm0 { 1.0, 0x1.0p32 } haddpd %xmm0, %xmm0 I've only tested this lightly so far. I'll add a hack to force linux to assume aligned stacks (true during bootstrap at least) and build there. r~
Attachment:
z
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |