Netboot Mailing List (by thread)


Re: Etherboot 4.5.5 released




First a general note on the feature request by Hans-Peter Jansen.  I don't
know what one would gain if etherboot would reboot after (say) 3 minutes without
dhcp replies.  It would just restart from the beginning.  And it would piss
me off, because if someone in the department (which usually is me, but
might be someone else) decides to reboot our server, it would cause those
suggested etherboot reboots to be triggered if someone decides to boot one of
the diskless machines right at that time.  The server needs at least 5 minutes
for a clean reboot, not to mention the estimated crash reboot time of 40
minutes.  It will irritate the users needlessly and they will end up in my
office with wild guesses ranging from power surges and broken power supplies to
thermal problems of the mainboard and/or the CPU.  No thanks.  What's
wrong with the current exponential backoff behavior?  If we assume the server is
unavailable for 10 minutes, it just means that on average the backoff counter
is at 7, so it will take up to 17 minutes (on average only half of that) until the
next try.  This is a bit long, but the users can force an instant retry if
they press ESC, as Ken said.  The grapevine generally takes care of the rest -
the users will find out very quickly that the server is back online.

But while looking at it I spotted fishy code in etherboot: if you look at the
code in main.c: load(), there is an unhandled case when bootp() fails and
EMERGENCYDISKBOOT is not set - it just pretends that it found something...
bootp() fails after 20 retries.  Not that this happens frequently - it takes
on average 1165 hours (48.5 days) to get to that point, but it might happen.
Not that this is in any way related to the feature request, except that after
a while it really takes ages until the next retry.  Maybe the RFC951 sleep should
be limited to an exponent of 10 (which would give approx. 34 minutes retry
interval, which shouldn't be a strain on the network even for a large network
of machines desperately waiting for the dhcp/bootp server to answer their
request).  Any volunteers?
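
The figures above can be roughly reproduced with a small model.  The C sketch
below is not taken from main.c.  It assumes, purely for illustration, that retry
c (counted from 0) sleeps a random time uniformly distributed between 0 and
2^(c+3) seconds, so the average sleep is 2^(c+2) seconds.  The function names
and the cap parameter are hypothetical.

```c
/* ASSUMED model, not the actual Etherboot code: retry number `retry`
 * (0-based) sleeps uniformly between 0 and 2^(retry+3) seconds.
 * `cap` is a hypothetical limit on the exponent; retries beyond it
 * reuse the interval of retry `cap`. */
static unsigned long max_sleep_secs(unsigned int retry, unsigned int cap)
{
    if (retry > cap)
        retry = cap;
    return 1UL << (retry + 3);
}

/* Average of a uniform sleep in [0, max] is max / 2. */
static unsigned long avg_sleep_secs(unsigned int retry, unsigned int cap)
{
    return max_sleep_secs(retry, cap) / 2;
}

/* Average total time spent sleeping over the first `retries` attempts. */
static unsigned long avg_total_secs(unsigned int retries, unsigned int cap)
{
    unsigned long total = 0;
    unsigned int c;

    for (c = 0; c < retries; c++)
        total += avg_sleep_secs(c, cap);
    return total;
}
```

Under this assumed model, max_sleep_secs(7, 19) is 1024 seconds (about 17
minutes), avg_total_secs(20, 19) is 4194300 seconds (about 1165 hours), and a
cap at retry index 9 gives avg_sleep_secs(9, 9) = 2048 seconds (about 34
minutes).  Whether retry index 9 corresponds to the "exponent of 10" mentioned
above is itself an assumption.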

Back to the other issues.

Doug Ambrisko wrote:
> I'll see if I can dig any up.  I finally got it to work under VMware
> after turning off some options.  Also I had trouble with my DHCP reply
> being to long.  I'll look into this later.  I ran into a couple
> glitches compiling on FreeBSD and I have included the simply patches at
> the end.

The DHCP reply length seems to be quite a common problem.  I haven't been
plagued by it for quite a while, but I doubt that it is related to the
changes in the 4.5.5 version.  Usually it is caused by an upgrade to the
dhcpd, which changes the order in which options are included (happened to
me when I upgraded to a newer version of ISC dhcpd).

BTW: you should have two log entries in the VMware log per reboot - something
about RX/TX non-busmaster transfers.  Do you have any idea what these mean?
I browsed through the AMD datasheets and didn't find anything they suggest doing
substantially differently from the etherboot driver.  But I might be wrong -
I have read far too many datasheets which mentioned the most important things in some
unsuspicious-looking clause in a sentence that was mostly useless bullshit.

> I also had to use a newer gas & ld, whereas before we just used a newer
> gas.  So I added:
>   AS=          `echo ../../bin*/gas/as-new`
>   LD32=          `echo ../../bin*/ld/ld-new`
> in the Makefile file for the FreeBSD "port".

What was the problem with the old ld?  Maybe the newer one isn't really required.
I cannot easily test any binutils dependency, as the oldest version I have
available is a pretty recent 2.9.1 variant shipped with a Linux distro and
I regularly use some Feb 2000 snapshot of 2.9.5 because I need some of the
new features for my real work.

> I ran into a compiler issue with gcc 2.95.2 ie from gcc -v
>         770z% gcc -v
>         Using builtin specs.
>         gcc version 2.95.2 19991024 (release)
>         770z%
> 
> It reported problems with the AX & BX registers.

Doug: can you give me the precise error message?  It would help to find the
root of the problem.  So far I could only test with egcs-1.1.2.  BTW: compiling
in the Multiboot ELF support if you just want to boot FreeBSD is a waste of ROM
space.  But what am I complaining about - you found a problem, that's what matters most.

I haven't yet seen a single popular OS converted to the Multiboot spec,
except for someone's private entertainment.  Everyone seems to like their
OS's loader regardless of the arbitrary restrictions it imposes on the startup
environment.  Only a minority switches to grub, netboot, etherboot, ...

> So this is a quick once over and I was able to get it to build under FreeBSD
> with my "port" that uses a newer gas & ld and it seemed to work close
> to before.

At least that's good to hear.

Your patch to the Makefile was not quite correct, though: the variable you added
_was_ missing, but it should point to the recompiled version.  A manually corrected patch
is appended (hopefully the tab chars don't get lost again...).  Ken, could you
apply this one, please?  Seems like I never actually built a floppy when I
recompiled the loaders.  If you already have the first patch in, just do it by
hand...

Ken: Noticed the target "precompiled" in the Makefile?  This should make your
release preparation work a bit easier.  Downside: the bug Doug reported with
the Makefile.

Doug: I'm really glad to see the second patch to linux-asm-string.h.  So the last
"random" #ifdef __FreeBSD__ has been killed.  Only the one in osdep.h remains...
Great.

Maybe the #if defined(__linux__) || defined(__FreeBSD__) could be replaced
by a #ifdef __GNUC__.  I don't see any reason why any of Etherboot should be Linux
or FreeBSD specific.  If you have the right tools you could build it under Windows
or whatever.
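
A minimal sketch of that idea, assuming the only real requirement is a GNU
compatible compiler: guard on __GNUC__ instead of on the host OS.  The macro
HAVE_GNU_TOOLCHAIN and the function toolchain_description() are hypothetical
names for illustration, not identifiers from the Etherboot sources.

```c
/* Hypothetical feature macro: set when the compiler identifies itself as
 * GNU C (gcc and compatible compilers predefine __GNUC__), regardless of
 * whether the host is Linux, FreeBSD or something else. */
#ifdef __GNUC__
#define HAVE_GNU_TOOLCHAIN 1
#else
#define HAVE_GNU_TOOLCHAIN 0
#endif

/* Returns a short description of the detected toolchain. */
const char *toolchain_description(void)
{
    return HAVE_GNU_TOOLCHAIN ? "GNU C compatible" : "unknown compiler";
}
```

Code that currently sits behind the OS check would then sit behind
HAVE_GNU_TOOLCHAIN (or directly behind #ifdef __GNUC__) instead.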

*** Makefile.orig       Sat Mar 18 00:30:20 2000
--- Makefile    Sat Mar 18 19:46:12 2000
***************
*** 166,171 ****
--- 169,175 ----
  PRLOADER=	bin/prloader.bin
  RZLOADER=	bin/rzloader.bin
  PRZLOADER=	bin/przloader.bin
+ FLOPPYLOAD=	bin/floppyload.bin
  COMLOAD=	bin/comload.bin
  endif

Cheers
===========================================================================
This Mail was sent to netboot mailing list by:
Klaus Espenlaub <espenlaub@informatik.uni-ulm.de>
To get help about this list, send a mail with 'help' as the only string in
its body to majordomo@baghira.han.de. If you have problems with this list,
send a mail to netboot-owner@baghira.han.de.


