[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [cdi-devel] Interrupts



On Mon, Jan 18, 2010 at 7:56 AM, Kevin Wolf <kevin@xxxxxxxxxx> wrote:
>> I would say that pcnet's implementation of its own solution when
>> cdi_wait_irq already exists was fundamentally wrong. There is no point
>> reinventing the wheel in each driver when you already have the
>> functionality available in the interface.
>
> Reinventing the wheel is just ugliness, it doesn't lead to fundamentally
> broken code. But at least you don't seem to be opposed to forbid such
> things in a better specification.

Right, the interface provides things like cdi_wait_irq for a reason. I
should have elaborated further when I stated that I felt the code was
fundamentally wrong. Pedigree for example implements cdi_wait_irq in
such a way that the calling thread can be put to sleep until the IRQ
fires, at which point it can be woken up directly. Tyndur might define
cdi_wait_irq in a different way, with its own unique semantics. This
is why I feel it is fundamentally wrong to write a CDI driver which
implements interface-provided functionality itself.

You are free to disagree, of course. I will not hold that against you,
but I do want to assert how I feel about the issue.

>> > The point is that currently in my kernel CDI drivers generally run with
>> > IRQs turned off as the kernel lacks proper locking, so I'm not sure what
>> > would happen if I would allow IRQs in any place (or even let it be
>> > preempted). So pcnet would wait forever for an IRQ to happen.
>>
>> Perhaps cdi_wait_irq needs a timeout parameter. This way we can also
>> enforce a decent timeframe in which an IRQ should occur, and avoid the
>> case where a driver hangs because an IRQ never arrives.
>
> It already has this parameter.

I should have checked, but I was a little rushed. So if it does have
this parameter, you could have a cdi_wait_irq(2000) in pcnet that
waits 2 seconds for an IRQ rather than hang the system forever for an
interrupt which will never arrive. Maybe cdi_wait_irq could also check
and enable interrupts to be absolutely certain (this is an OS-specific
thing though).

In my opinion, IRQ support should be a requirement for the CDI
interface. The alternative is that drivers have multiple code paths:
one for "IRQs available", the other for "No IRQs available" (and throw
an error when the device can't run without IRQs).

>> > What do you think? Do all other OSes implementing CDI so far allow IRQs
>> > at any time or does it just happen to work? (I think meanwhile all
>> > drivers are well-behaved and don't do it manually)
>>
>> Drivers that use interrupts should be able to assume interrupts are
>> enabled and working properly.
>
> I fully agree that driver should be safe to assume that interrupts work.
> The question is what "working properly" means. Basically this is about
> whether a driver function can be interrupted for a IRQ handler (like for
> a Unix signal handler) or if the IRQs are queued. I strongly suggest the
> latter - in the example of pcnet this would mean that the code was
> completely wrong and could not even have worked (which shows that tyndur
> doesn't implement this semantics yet).

Keep in mind that I come from the monolithic school of thought, where
I can say that the driver is just another function call and therefore
can be interrupted easily. You come from microkernel land, where
drivers are actually running outside the kernel.

> I've though a bit more about it, and in fact I think what we need to
> clearly define is semantics with respect to concurrency and/or
> signal-like interruptions. IDC message handling is another thing of this
> type, should it ever be implemented.

Semantics are always fun to discuss :-)

> If we want to allow concurrency/interruption, we need to define things
> like locks, signal blocking etc. I really don't want to get into that
> business with CDI because I fear trouble with the different types of
> OSes here. So maybe the most reasonable approach is to defined that at
> the same time at most one driver function per device is running and that
> it may not be interrupted as in signals. The library can then define the
> necessary message queues, threading with locks or whatever model it
> likes. The drivers themselves can be kept simple then.

I agree with the concept of only one driver function per device. This
avoids race conditions when accessing device-specific data in the
driver (eg, rx/tx buffers on a NIC) and also helps implementors to
visualise the way CDI is used a little easier.

If IRQs are queued until after a driver function is complete,
cdi_wait_irq may need to actually break from convention here.
cdi_wait_irq is, in my opinion, one of the most important functions in
CDI. Drivers often need to wait for an IRQ to determine if a network
packet got sent correctly, or to find out that a disk sector has been
transferred.

Perhaps cdi_wait_irq could set a flag which allows IRQs to be called?
Something like this:

cdi_wait_irq:
check irq queue, run irq handler and return "success"
set "interruptible" flag
wait for IRQ, run irq handler if one comes in before timeout
unset "interruptible" flag
return timeout or success

This may entail a cdi_os_wait_irq function (or something like that) to
actually perform the wait for an IRQ in order to remove this
complicated CDI trickery from the OS-provided implementation.

I'm interested to hear yours and Max's thoughts.

Cheers,

Matt