- Sponsor
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
a test for when the clang compiler method signature mismatches with the julia ccall one #978
Conversation
Needs a rebase; then I will merge it. |
Is this the likely reason for the instability we see with clang in #1013? |
@vtjnash: can you fix this up? Should we merge it? |
@vtjnash can you check if this is fixed with Xcode 4.4? |
It wasn't fixed with clang 3.1 (although I feel I need to find a way to trigger this test more reliably). Really this needs to be fixed somehow in Julia / LLVM. |
Bump |
Should this be filed as a bug in clang? What should we do next? |
Sorry for the huge delay. I didn't know llvm well enough. The problem is with Julia not LLVM, I just didn't understand where or how to look previously. @JeffBezanson I've come to realize that method signatures are slightly incorrect. It is very important to tell llvm the sign of the arguments (signext or zeroext). I can show you why later / on IRC / in person (in short: llvm will remove the previous bit casts because it assumes they weren't necessary). GCC is much more forgiving regarding mismatching type signatures than clang. |
http://llvm.org/docs/LangRef.html#paramattrs |
I have to say it frequently seems to me like llvm should have just gone with having signed and unsigned integer types. Oh well. |
I'm kind of glad it doesn't have them, since they don't really exist. All that's really there is a bit string after all. @vtjnash do you think we should add the following:
|
@JeffBezanson: yes, that seems right, but you need to select zext or sext based on the sign of the argument. I agree with Stefan. Looking at how the code generator works, they should have just gone with explicitly signed and unsigned types. I've observed that llvm seems to mostly ignore explicit bit modification instructions, so they don't actually happen until just before the function call, and then they happen based upon this hinting. (they probably are observed more closely in the presence of load / store instructions, but I haven't investigated). Additionally, this looks like it is necessary for every function created in LLVM, even if they are purely JIT functions, for any bit type. Pointer types I think are OK without it. Also from that link above, most ccall functions would benefit from having the sspreq attribute set, to protect the stack. Also byval seems rather interesting, although I'm not sure if it can be any more efficient that the current implementation. |
I don't fully understand why we would need this, since in every call I generate the types match exactly, as required by the llvm verifier. And the type of the actual argument should matter more, since for example you would zero extend a uint8 to pass it as a signed int32. |
That's what this pull request was originally about. Correctly matching the types is not enough. The signature of the function must also be made to match, or (I've observed that) the assembly to guarantee conversion of the types will not be generated. And you can't pass an int8 to a function that requires an int32, their sizes don't match. Here's my final code testcase (in c first, for readability): int8_t ptr[300];
int8_t f2a(int8_t x) { return ptr[x]; }
int8_t f1a(int64_t x) { return f2a( (int8_t)(x&0xff) ); }
int8_t f2b(int8_t x) { return x+1; }
int8_t f1b(int64_t x) { return f2b( (int8_t)(x&0xff) ); }
int main() {
s1 = f1a( 5000 );
s2 = f2a( 5000 );
return (int)s2;
} (disclaimer: I just wrote this for readability, I didn't actually use and/or test it. It should generate something equivalent to the below LLVM, if the optimizations don't eliminate everything) Compiling this by hand into raw llvm opt codes (this is what I really was using): @ptr = global i8 300
define i8 @f2a(i8 zeroext %x) {
%1 = getelementptr i8* @ptr, i8 %x
%2 = load i8* %1, align 4
ret i8 %2
}
define i8 @f1a(i64 %A) {
%s1 = trunc i64 %A to i8
%s2 = call i8 @f2a(i8 %s1)
ret i8 %s2
}
define i8 @f2b(i8 zeroext %x) {
%1 = add i8 %x, 1
ret i8 %1
}
define i8 @f1b(i64 %A) {
%s1 = trunc i64 %A to i8
%s2 = call i8 @f2b(i8 %s1)
ret i8 %s2
}
define i32 @main() {
%s1 = call i8 @f1a(i64 5000)
%s2 = call i8 @f1b(i64 5000)
%s3 = sext i8 %s2 to i32
ret i32 %s3
} Experiment with removing and changing the various zeroext and truncate instructions and you will find that LLVM does not generate sufficient code to remove the extra bits if the type of the argument is not specified as zero/sign extended. Generate the associated assembly with .section __TEXT,__text,regular,pure_instructions
.globl _f2a
.align 4, 0x90
_f2a: ## @f2a
.cfi_startproc
## BB#0:
## kill: EDI<def> EDI<kill> RDI<def>
movsbq %dil, %rax
leaq _ptr(%rip), %rcx
movb (%rax,%rcx), %al
ret
.cfi_endproc
.globl _f1a
.align 4, 0x90
_f1a: ## @f1a
.cfi_startproc
## BB#0:
pushq %rax
Ltmp1:
.cfi_def_cfa_offset 16
movzbl %dil, %edi
callq _f2a
popq %rdx
ret
.cfi_endproc
.globl _f2b
.align 4, 0x90
_f2b: ## @f2b
.cfi_startproc
## BB#0:
incb %dil
movb %dil, %al
ret
.cfi_endproc
.globl _f1b
.align 4, 0x90
_f1b: ## @f1b
.cfi_startproc
## BB#0:
pushq %rax
Ltmp3:
.cfi_def_cfa_offset 16
movzbl %dil, %edi
callq _f2b
popq %rdx
ret
.cfi_endproc
.globl _main
.align 4, 0x90
_main: ## @main
.cfi_startproc
## BB#0:
pushq %rax
Ltmp5:
.cfi_def_cfa_offset 16
movl $5000, %edi ## imm = 0x1388
callq _f1a
movl $5000, %edi ## imm = 0x1388
callq _f1b
movsbl %al, %eax
popq %rdx
ret
.cfi_endproc
.section __DATA,__data
.globl _ptr ## @ptr
_ptr:
.byte 44 ## 0x2c
.subsections_via_symbols Note that if zeroext is removed, the movzbl instructions will also disappear. Then both of these functions will return an incorrect answer, despite the attempt at type matching! The argument types in the function did not require it, so the callers did not bother to clear the high bits. |
Ah, so an i8 argument with no attribute behaves as a word where the higher bits are just undefined? That strikes me as a bad way to avoid a single mov instruction. Alternatively, perhaps this can be seen as a flaw in getelementptr; it should require the offset to be extended to a pointer-size integer. |
Actually, julia might be less susceptible to this problem, since internally all array indexes are word-size integers, and any other integer types are explicitly converted first. I don't think we generate any GEPs with non-word-size integer offsets. |
I played with this some more and found (no big surprise here) that the important thing is to be consistent (I was analyzing the above assembly slightly incorrectly). Thus, if the function is declared with zeroext in the parameter type, then LLVM will generate code that assumes the high bits of that parameter are already zero (including optimizing away explicit casts) but it will also generate code at the call site to ensure this condition is met (otherwise it will optimize those casts away). When clang is compiling functions, it declares the function parameter types to include the zeroext / signext modifiers (e.g. it assumes that the caller will take care of the conversion). By contrast, when GCC compiles functions, it appears to take the safe approach and has both the caller and callee perform the bit truncation. Therefore, only external functions (those using addGlobalMapping, esp. those in ccall) must be annotated (to match clang's expectations) to ensure proper behavior. Internally to Julia, this parameter is optional and only serves to occasionally allow slight additional optimization. |
…er parameters in clang-compiled code
Following up on on conversation last night, clang only applies signext/zext to i8 and i16 arguments (not pointers, floats, or larger inteters). It's probably still worth asking them what their algorithm is. |
The clang / llvm-gcc compiler that ships with Apple does not cast 8 bit values to the appropriate size in function arguments. I worked around this bug in the fall to correct boxing of 8 bit values and get the code to compile, but I realized this could affect ccall also (if the target code was compiled with llvm-gcc / clang) so I started to put together this testsuite for ccall to demonstrate the error.