Add a couple of sanity checks#2
Conversation
Clarify connection instructions
tchebb
left a comment
There was a problem hiding this comment.
Thanks, and sorry for the slow review! I assume the unhandled TW_BUS_ERROR state is what was causing your issue, and I'd love to fix that. Some of the other changes seem unneeded, though. Take a look at my individual comments, which all reference the ATmega48P/V/88P/V/168P/V Datasheet, chapter 22.
| uint8_t offset; | ||
|
|
||
| switch (TWSR) { | ||
| switch (TW_STATUS) { |
There was a problem hiding this comment.
Good catch, TW_STATUS is definitely more correct here. The datasheet says we need to mask out the prescaler bits, and I hadn't done that.
|
|
||
| // Acknowledge interrupt to let hardware continue. | ||
| TWCR |= _BV(TWINT); | ||
| TWCR = _BV(TWEN) | _BV(TWIE) | _BV(TWINT) | _BV(TWEA); |
There was a problem hiding this comment.
What's the purpose of this change? I know that, in general, read-modify-write of registers can have unintended side effects, but I don't think that's the case for TWCR. The only bits that might potentially cause the hardware to do things when rewritten are TWSTA and TWSTO, but the only place we use them is in the new case above, which should have an early return anyway. Unless I'm missing something, I'd prefer to keep the old line for clarity.
There was a problem hiding this comment.
What's the purpose of this change? I know that, in general, read-modify-write of registers can have unintended side effects, but I don't think that's the case for
TWCR. The only bits that might potentially cause the hardware to do things when rewritten areTWSTAandTWSTO, but the only place we use them is in the new case above, which should have an early return anyway. Unless I'm missing something, I'd prefer to keep the old line for clarity.
THIS is what was causing the issue. Read-modify-write is a big no-no here because the read bits are volatile and not meant to be written back.
There was a problem hiding this comment.
Interesting, I stand corrected. Do you know which specific bit is the issue? Out of the R/W bits in the register, the only ones your version doesn't write are TWSTA and TWSTO. The hardware never sets those according to the datasheet, and they're not write only, so they should read as zero in my version and result in the exact same write you have.
Also, this ISR has to successfully fire many times (once for each byte) for even a single boot to work. So I'm curious what the error state is and how we enter it.
There was a problem hiding this comment.
Interesting, I stand corrected. Do you know which specific bit is the issue?
All three of the other ones I'm writing are needed. I tested.
Also, this ISR has to successfully fire many times (once for each byte) for even a single boot to work. So I'm curious what the error state is and how we enter it.
When the i2c read wraps around the end of the buffer, the state reported by the bits changes to something other than what we want.
There was a problem hiding this comment.
All three of the other ones I'm writing are needed. I tested.
Right, all three of those are for sure needed. I'm asking which extra bits were written in my implementation and broke things. Or, conversely, if any of those three were somehow cleared by hardware, which would also break things. My reading of the datasheet is that, for this register, read-modify-write is equivalent to what you've written. But if this fixed your issue, I must be wrong. I'll do some more investigation on my own and see if I can repro the issue.
There was a problem hiding this comment.
Huh, so I guess that means that they all get set to zero by hardware in the error state. I'm very curious what it is now. By "wraps around the end of the buffer", do you mean when this check starts hitting the second case and setting TWDR to 0?
amlogic-hdmiboot-avr/i2c-flash.c
Line 34 in d6a54c2
There was a problem hiding this comment.
I mean when offset starts being less than sizeof again after having been greater than or equal, yes.
There was a problem hiding this comment.
Also, IIRC, not all of them are cleared each time, but sometimes they are. Either way, the library code makes it plain as day that preserving the values of those bits is not the thing to be done here.
There was a problem hiding this comment.
There are often multiple correct ways to program hardware. I wouldn't necessarily take the Arduino code as gospel with regards to AVR: it's gotten better over the years, but it's often had some pretty suboptimal stuff. For example, the entire reason this repo exists is because the Arduino I2C library can't manage to respond to interrupts fast enough to avoid clock stretching.
That's not to say it's wrong in this case, but it is why I want to investigate further before merging this.
There was a problem hiding this comment.
For example, the entire reason this repo exists is because the Arduino I2C library can't manage to respond to interrupts fast enough to avoid clock stretching.
I would attribute that to the significantly higher overhead the library has.
| // Turn on TWI in slave mode, enable interrupt. | ||
| sei(); | ||
| TWSR = TW_NO_INFO & 1; | ||
| TWBR = 18; |
There was a problem hiding this comment.
Neither of these lines should be needed, since they're both for clock setup but we don't generate an I2C clock in slave mode. Did they make a difference for you?
There was a problem hiding this comment.
Neither of these lines should be needed, since they're both for clock setup but we don't generate an I2C clock in slave mode. Did they make a difference for you?
That might be true if you're using a real Arduino, but I'm using an Adafruit Boarduino, which doesn't have a crystal and is therefore more prone to drift without this.
There was a problem hiding this comment.
It really shouldn't matter regardless. I2C is timed using pulses on the SCL line, which is driven by the master. As a slave device, the AVR isn't timing anything at all. See 22.5.2 in the datasheet on the Bit Rate Generator Unit, which is what both these writes configure:
This unit controls the period of SCL when operating in a Master mode. … Slave operation does not depend on Bit Rate or Prescaler settings, but the CPU clock frequency in the Slave must be at least 16 times higher than the SCL frequency.
The only thing these lines change are the "Bit Rate or Prescaler settings".
There was a problem hiding this comment.
IDK, I tried it both ways and its behavior just seemed more stable this way. It could be mere coincidence, but it can't hurt anything.
| apply power to the device. It should enter BootROM USB DFU mode with no | ||
| additional interaction needed. | ||
| on the ATmega48/88/168/328), attach HDMI pin 17 to GND, and shunt HDMI pins 18 | ||
| and 19 together, then connect that to your target device and apply power to it. |
There was a problem hiding this comment.
These are hotplug detect, right? Are they needed on every device? I don't remember having to do anything specific with them.
There was a problem hiding this comment.
The SoC uses hotplug detect as a sanity check to make sure a dongle is actually plugged in. It doesn't attempt i2c if there is nothing there.
There was a problem hiding this comment.
Which SoC? That might be new behavior, as I developed this with reference to an S905D3 BootROM dump, and that seems to poll I2C unconditionally without checking any GPIO first. I'll double check on real hardware, though, since maybe there's a check hidden somewhere and I got lucky by leaving it floating.
This is a good thing to mention regardless, but I'd like to be specific about whether it's needed for all chips or only for some.
There was a problem hiding this comment.
The one I was messing with is S905W4, so that one at the very least. And no, the W4 part is not a typo.
| TWCR = _BV(TWEN) | _BV(TWIE) | _BV(TWEA) | _BV(TWINT) | _BV(TWSTO); | ||
| while (TWCR & _BV(TWSTO)) | ||
| continue; | ||
| case TW_NO_INFO: |
There was a problem hiding this comment.
The datasheet says that we should never observe this value after an interrupt. If we do, it's still probably more correct to clear the interrupt instead of returning immediately. So I'd remove this case.
There was a problem hiding this comment.
It may say that, but it did not work correctly until I copied this here from the Arduino IDE's library.
| switch (TWSR) { | ||
| switch (TW_STATUS) { | ||
| case TW_BUS_ERROR: | ||
| TWCR = _BV(TWEN) | _BV(TWIE) | _BV(TWEA) | _BV(TWINT) | _BV(TWSTO); |
There was a problem hiding this comment.
I think this case should be written as TWCR |= _BV(TWSTO) | _BV(TWINT) followed by an early return. There's no point spinning on the TWSTO clear when we could just sleep and wait for the next interrupt instead. And the early return stops us from clearing the interrupt twice, which is probably harmless but isn't what the datasheet instructs.
There was a problem hiding this comment.
I think this case should be written as
TWCR |= _BV(TWSTO) | _BV(TWINT)followed by an early return. There's no point spinning on theTWSTOclear when we could just sleep and wait for the next interrupt instead. And the early return stops us from clearing the interrupt twice, which is probably harmless but isn't what the datasheet instructs.
Again: Read-modify-write == bad juju
Nope. This part was just bikeshedding for completeness. It never entered this state. What was causing the issue is below. |
|
Gonna hold off on merging until I can investigate more myself. There's clearly something I'm misunderstanding about the AVR's behavior, and I don't want to merge code I don't understand. Thanks for all the info, that should help me repro. |
After the Arduino responds over I2C, its TWI interface was being left in some kind of undefined state, so if the target device were to lose power momentarily while the Arduino is still powered, it wouldn't respond a second time, and the target device would attempt a full boot. I borrowed a bit of code from Wire.cpp in the Arduino IDE that seems to clean this up such that the Arduino can respond again and again most of the time if need be.