PSOC 4 inline assembly ARM or Thumb? | Cypress Semiconductor

Summary: 15 Replies, Latest post by paul.chambre_1977636 on 20 Nov 2016 10:28 PM PST
Verified Answers: 1
paul.chambre_1977636
User
44 posts

Hi folks,

I'm trying to embed some inline assembly in my project to work around areas where I can't get the compiler and optimizer to do what I want.

I pulled the lst file for the main loop, and I can see a couple of places that I want to tweak, to save a few precious clock cycles. What has me confused is that the lst formatting looks like ARM assembly, but AN89610 (Code Optimization document for PSOC 4) says that PSOC 4 uses Thumb 2.

Is there a way to choose ARM or Thumb?

I modified the main loop portion of the lst code and swapped it in within asm(" .... "); The result is a "Cannot represent THUMB_OFFSET relocation in this object file" error during build. The error references a .s file and line number, but I can't find the actual file. I'm guessing it's a temporary file that only exists during the build process.

At this point, I'm getting almost the performance I need from my solution in C, but I have tried many variants, and none of the alternatives I found work. I do think I need the extra savings I can get from using assembly in the critical sections.

Thanks for the help,

Edit: I think I see the cause of the error: I missed that an array I'm using is represented as a label within the assembly block. I'll need to fix the assembly to correctly address the array. I'm still rather confused about the ARM vs. Thumb stuff, though.

 

Paul
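
On the ARM vs. Thumb question: Cortex-M cores execute only Thumb instructions, so there is no mode to choose, and the lst output looks like ARM assembly largely because the two share mnemonics. One way to confirm what the compiler is targeting is to check its predefined macros. The following is a minimal sketch, assuming GCC or Clang; the function name target_isa is made up for illustration.

```c
/* Reports which instruction set this file is being compiled for,
 * using the compiler's predefined target macros.
 * target_isa is a hypothetical helper name, not part of any SDK. */
const char *target_isa(void)
{
#if defined(__thumb2__)
    return "Thumb-2";
#elif defined(__thumb__)
    return "Thumb (16-bit subset, as on ARMv6-M)";
#elif defined(__arm__)
    return "ARM";
#else
    return "not an ARM target";
#endif
}
```

On a Cortex-M0 build one would expect the plain Thumb branch to be selected, but it is worth checking against your own toolchain rather than taking that as given.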

paul.chambre_1977636
User
44 posts

OK. Here's what I'm seeing now.

The label I had overlooked (.L61 in my project) is the base address for the addresses of the port status and data registers. Some of these are pre-loaded into registers before the block of code that I'd like to modify, but a couple of them are read "on the fly" (.L61+16 or .L61+20).

Any recommendations on accessing the port registers reliably through inline assembly? I would think even the pre-loaded ones should be something I'm not counting on.

user_365962704
User
224 posts

I think on Cortex-M devices (ARMv6-M in the case of the Cortex-M0 in PSoC 4) you can only use the Thumb instruction set (ARM was used in older cores).

Labels like .L61 seem to be literal pools, and I think using literal pools is the best way to access registers. Anyway, you cannot encode an arbitrary 32-bit immediate value in a single instruction, so the value is stored in a literal pool near the code and fetched with a PC-relative load. That's what ldr with an =value operand does under the hood (it's a pseudo-instruction, not a real instruction).

Here are some useful links:

https://community.arm.com/docs/DOC-7869

http://www.ethernut.de/en/documents/arm-inline-asm.html

 

I'm trying to learn inline asm too. Remember to use asm volatile (); this way the optimizer will not remove or reorder your inline asm.

Hope it helps
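
To illustrate the two points above (literal pools and asm volatile), here is a minimal sketch of a register read where the address is passed as an input operand, so the compiler is free to fetch it from a literal pool itself. The function name read_reg is hypothetical, and the non-ARM branch exists only so the sketch also builds on a host machine.

```c
#include <stdint.h>

/* Reads a 32-bit register. The address arrives as an input operand,
 * so the compiler decides how to get it into a register
 * (typically a literal-pool load). read_reg is a hypothetical name. */
uint32_t read_reg(const volatile uint32_t *addr)
{
#if defined(__arm__) || defined(__thumb__)
    uint32_t value;
    /* volatile: keep this asm even if the result looks unused */
    __asm__ volatile("ldr %[val], [%[ptr]]"
                     : [val] "=l"(value)   /* output, low register */
                     : [ptr] "l"(addr)     /* input, low register  */
                     : "memory");
    return value;
#else
    return *addr; /* portable fallback for non-ARM hosts */
#endif
}
```

On the target this would be called with a register address such as the CYREG_PRT2_PS value mentioned in this thread, cast to a pointer.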

paul.chambre_1977636
User
44 posts

Thanks.

I started with the ethernut cookbook before posting here. It seemed helpful, but it conflicts in places with the inline assembly comments in http://www.cypress.com/file/46521/download.

According to the Cypress documentation, something like this:

            "ldr r4, =CYREG_PRT2_PS\n"    //read address high
            "ldr r3, [r4]\n"

Should work, but, in fact, causes a no-information compiler failure. That's the piece I'm trying to figure out at the moment.

paul.chambre_1977636
User
44 posts

Then again, I'm not too sure about the Cypress doc. I tried this example from it (page 15), and it won't even compile, let alone build:

    int foo = 5L;
    int bar;
 
    bar = foo + 1;
 
    /* bar = foo + 1 */   
    asm("LDR r0, =foo\n"        
        "LDR r1, =bar\n"        
        "LDR r2, [r0]\n"        
        "ADD r2, r2 #1\n"        
        "STR r2, [r1]"); 

Removing the extra(?) r2 in the ADD line allows it to pass the first pass of the compiler, but still fails with the same general error as I was getting from my code.
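
One plausible reason the example still fails, assuming foo and bar are local variables, is that locals have no symbol the assembler can resolve for =foo. A hedged alternative is to pass the values as operands and let the compiler choose the registers. The sketch below uses the same plain add mnemonic style as the code in this thread (GCC's default inline-assembly syntax for Thumb); under unified syntax the flag-setting form adds would be needed instead. The name add_one is hypothetical, and the non-ARM branch is only there so the sketch builds on a host.

```c
/* Computes bar = foo + 1 with the value passed through operands
 * instead of naming C variables inside the assembly string.
 * add_one is a hypothetical name used for illustration. */
int add_one(int foo)
{
    int bar;
#if defined(__arm__) || defined(__thumb__)
    __asm__("add %[out], %[in], #1"
            : [out] "=l"(bar)   /* compiler picks the result register */
            : [in] "l"(foo)     /* compiler picks the source register */
            : "cc");            /* the 16-bit Thumb add updates flags */
#else
    bar = foo + 1;              /* portable fallback for non-ARM hosts */
#endif
    return bar;
}
```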
 

paul.chambre_1977636
User
44 posts

This almost works:

        asm(
            "ldr r3, [%[datawrite]]\n"  //Get address of data write function
            "mov    r2, #125\n"
            "str r2, [r3]\n"  //write data bus from r2
            :
            :   [addhigh] "l" (Pin_Address_High_PS),
                [addlow] "l" (Pin_Address_Low_PS),
                [datawrite] "l" (Pin_Data_DR),
                [dataread] "l" (Pin_Data_PS),
                [pinrw] "l" (CYREG_PRT0_PS),
                [buffer] "l" (buffer)
            );
 

It compiles, and builds, and programs, but it doesn't actually set the 125 value on the data bus. However, the resultant lst output does look pretty similar, so I think I'm close to getting this right.

On the other hand, one difference I observed between what's in the lst files and what will compile is f after the label of a forward jump. The generated assembly in the lst files had this, but it caused a compile failure when I used it in my code.

paul.chambre_1977636
User
44 posts

OK. This is not making much sense. I tried really simplifying the C code, to compare the lst output to the lst output from my really simplified inline assembly. To my eye, the resultant lst content looks logically equivalent, but the C works, and the inline assembly does not. By "works", I mean that the C writes 125 to the register associated with Pins_Data, and the assembly leaves that as all high values (255).

The two results (source and lst) are attached (because of the Cypress spam filter that seems to fire whenever too much example code is embedded in a post).
 

paul.chambre_1977636
User
44 posts

I got the really simple case working: writing a byte to a pin set:

         asm(
            "mov r3, %[datawrite]\n"  //Get address of data write function
            "mov    r2, #125\n"
            "str r2, [r3]\n"  //write data bus from r2
            :
            : [datawrite] "l" (&Pin_Data_DR)
            );

The optimizer still messes it up a bit, but it works. Now just to figure out all the other issues....
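
Part of what the optimizer gets wrong here may come from the asm statement not declaring the registers it overwrites. A minimal sketch of the same write with a clobber list follows; it assumes the register address is passed in as a pointer, and the name write_125 is made up for illustration. The non-ARM branch is only a host fallback.

```c
#include <stdint.h>

/* Writes 125 to the given register using fixed scratch registers,
 * and tells the compiler which registers and state are overwritten.
 * write_125 is a hypothetical name used for illustration. */
void write_125(volatile uint32_t *data_reg)
{
#if defined(__arm__) || defined(__thumb__)
    __asm__ volatile(
        "mov r3, %[datawrite]\n\t"  /* copy register address into r3 */
        "mov r2, #125\n\t"          /* value to write */
        "str r2, [r3]\n\t"          /* store value to the register */
        :
        : [datawrite] "l"(data_reg)
        : "r2", "r3", "cc", "memory"); /* clobbers: keep operands out of r2/r3 */
#else
    *data_reg = 125u;               /* portable fallback for non-ARM hosts */
#endif
}
```

With r2 and r3 listed as clobbered, the compiler should avoid placing the input operand in either of them, and it knows not to keep live values there across the statement.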
 

user_1377889
User
10575 posts

Well, Paul, I can assure you that it is a challenge to be better than the GCC optimizer! Did you try to set (you can do that on a .c file basis)  the optimization level to "speed" or "size"? There are even other settings to try mentioned in the GNU compiler manual.

 

Bob

paul.chambre_1977636
User
44 posts

Yes. It's built using speed optimization. I also had to add some noinline optimizer hints to keep the optimizer from really messing up parts of it.

The inline assembly is to deal with areas where an if/else would be more efficient than the current code, but the optimizer won't accept it. The optimizer is also doing things like checking an if condition at both the top and bottom of a block of code.
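
For readers unfamiliar with the optimizer hints mentioned above, the following is a minimal sketch of a noinline attribute combined with a per-function optimization level, assuming GCC. The HOT_PATH macro and the combine_address function are hypothetical names, not taken from the project discussed here. GCC's own documentation describes the optimize attribute as a debugging aid, so treat it as an experiment rather than a fix.

```c
#include <stdint.h>

/* HOT_PATH is a hypothetical convenience macro.
 * GCC accepts both attributes; Clang gets noinline only. */
#if defined(__GNUC__) && !defined(__clang__)
#define HOT_PATH __attribute__((noinline, optimize("O3")))
#elif defined(__clang__)
#define HOT_PATH __attribute__((noinline))
#else
#define HOT_PATH
#endif

/* Combines two 8-bit bus halves into a 16-bit address.
 * The function itself is illustrative only. */
HOT_PATH uint32_t combine_address(uint32_t high, uint32_t low)
{
    return ((high & 0xFFu) << 8) | (low & 0xFFu);
}
```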

paul.chambre_1977636
User
44 posts

OK. So, the current piece I'm fighting with is:

1. Variables passed through using symbolic names (or position identifiers) in the input section get mapped automatically with ldr statements into registers the compiler picks

2. I don't see a way to control these ldr commands or predict which registers will be used for which variables

3. My code picks up with the mov commands to put the variable addresses into particular registers

4. The compiler doesn't care which registers I've selected.

The current state of things is that I want to use r3 and r4 for the variable addresses, but the compiler uses r2 and r3 for its ldr commands, and then executes my mov commands in such a way that r2 overwrites r3 before I ever get a chance to use it.

So, the main question I have is, is there a way to know or to symbolically use the registers that the compiler is going to select for its ldrs? 

Edit: If I look at the lst code, and then swap my selected registers to match what the compiler picked for its ldr targets, things work (although there's a pointless mov r3, r3); but this seems like the wrong way to do things, and likely to break easily when any code is changed.

Edit 2: Never mind. I figured it out. I was using the mov commands because the ARM Thumb2 documentation was pretty clear about needing to first load a variable address into a register before you could do anything with it, but the compiler is actually doing that for me, so the mov commands are not needed. The correct simple example is this:

         asm(
            "mov    r2, #125\n"
            "str r2, [%[datawrite]]\n"  //write data bus from r2
            :
            : [datawrite] "l" (&Pin_Data_DR)
            );

The compiler will pick a register for &Pin_Data_DR, emit an ldr to load the address into it, and then substitute that register into the str instruction.
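
Taking the same idea one step further, the value can also be passed as an operand, so no register is hardcoded at all and the compiler cannot accidentally pick a register the assembly overwrites. This is a minimal sketch under that assumption; write_reg is a hypothetical name, and the non-ARM branch is only there so the sketch builds on a host.

```c
#include <stdint.h>

/* Stores a value to a register with every register chosen by the compiler.
 * write_reg is a hypothetical name used for illustration. */
void write_reg(volatile uint32_t *reg, uint32_t value)
{
#if defined(__arm__) || defined(__thumb__)
    __asm__ volatile("str %[val], [%[addr]]"
                     :
                     : [val] "l"(value),  /* value, any low register   */
                       [addr] "l"(reg)    /* address, any low register */
                     : "memory");         /* the store changes memory  */
#else
    *reg = value;                         /* portable fallback for non-ARM hosts */
#endif
}
```

On the target, the call corresponding to the example above would pass &Pin_Data_DR and 125.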
