
7. Troubleshooting (Solaris x86 FAQ)



(7.0) TROUBLESHOOTING

(7.1) What can I do if Solaris won't boot?

You need to boot from your install CD. Insert the Solaris Software CD in your CDROM drive. If your CDROM drive/BIOS isn't bootable, first insert the "Device Configuration Assistant" (DCA) diskette. At the "Boot Solaris" menu, choose "CD."

At the "Type of Installation: Interactive or JumpStart" menu, type "b -s"

Or, after the video, network, and time-and-date configuration screens, you'll notice that one of the menus has an [Exit] button. Select Exit and, when it asks you again "do you want to exit?", just say yes.

Once you're at the UNIX root prompt, #, you can mount the boot drive with "mount /dev/dsk/c0t0d0s0 /mnt" (omit the "t0" for ATAPI) and inspect the boot drive for anything that is wrong.
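
Once the root slice is mounted, a quick first check is whether the files the boot process depends on are still there. The following is only a rough sketch: the function name check_boot_files and the list of files it looks for are my own assumptions, not part of the original answer, and it assumes a POSIX-style shell (ksh or bash).

```shell
# Hypothetical helper: report which commonly needed boot-time files are
# missing under a mounted root (default /mnt). The file list is an assumption.
check_boot_files() {
    root=${1:-/mnt}
    missing=0
    for f in etc/vfstab etc/system etc/path_to_inst; do
        if [ -f "$root/$f" ]; then
            echo "ok       $root/$f"
        else
            echo "MISSING  $root/$f"
            missing=1
        fi
    done
    # Exit status 0 means every file was found, 1 means at least one is missing.
    return $missing
}

# Example, after "mount /dev/dsk/c0t0d0s0 /mnt":
#   check_boot_files /mnt
```

A missing file here does not prove that it is the cause of the boot failure; it only tells you where to start looking.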

[Modified from Bob Palowoda's Solaris 2.4 x86 FAQ]


(7.2) How do I restore the Solaris boot block without reinstalling?

You may need to do this after installing a boot manager that comes with another operating system (such as LILO from Linux) or an after-market multi-OS boot manager. These sometimes trample the active partition, which in our case is Solaris. Also, moving the Solaris partition with a partition manager program such as Partition Magic requires reinstalling the Solaris boot block. Before taking these steps, first verify that the Solaris partition is active. If it isn't, just make the Solaris partition active and reboot. Otherwise follow the steps below.

1. Boot from CD-ROM and get the root prompt, #, as described in the previous question, 7.1.

2. Determine the controller, disk number, and partition. The boot disk is /dev/rdsk/c?t?d?p?, where the ?s are the controller #, target ID, disk #, and partition #, respectively. Omit "t?" for ATAPI, e.g., /dev/rdsk/c0d0p0.

3. Verify that it's the correct device by running prtvtoc on the drive. This is VERY important; if it's wrong, you may hose another partition: prtvtoc /dev/rdsk/c0t0d0p0 (omit "t0" for ATAPI; always use p0, which means the "entire drive"). prtvtoc prints the map for the Solaris partition on the hard drive, if found. The partitions shown in the output are actually "slices" within the Solaris partition.

4. Restore the boot block as follows:

   /sbin/fdisk -b /usr/lib/fs/ufs/mboot (raw disk dev)
E.g., for SCSI it might be:
   /sbin/fdisk -b /usr/lib/fs/ufs/mboot /dev/rdsk/c0t0d0p0
(omit "t0" for ATAPI)

5. Finally, remove your CD-ROM and diskette media and type "/sbin/shutdown -i6" to reboot. The Solaris Multiple Device Boot Menu should appear after rebooting. If not, you can always do an upgrade (re-)install.

Note: This procedure does NOT make your Solaris partition active again (sometimes needed after installing another operating system, such as Windows, on the same disk); it just writes to your boot block IN your Solaris partition. To learn more about the Solaris boot process, read the boot(1M) man page.
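
Because a wrong device name here can damage another partition, it may help to build the name mechanically and look at the commands before running them. The sketch below does only that: raw_disk_dev and show_restore_commands are hypothetical helper names of mine, they print commands rather than run them, and a POSIX-style shell (ksh or bash) is assumed.

```shell
# Hypothetical helper: build the raw whole-disk device name described in
# step 2. Usage: raw_disk_dev scsi|atapi CONTROLLER TARGET DISK
raw_disk_dev() {
    case "$1" in
        scsi)  echo "/dev/rdsk/c${2}t${3}d${4}p0" ;;
        atapi) echo "/dev/rdsk/c${2}d${4}p0" ;;   # ATAPI names have no "t?" part
        *)     echo "usage: raw_disk_dev scsi|atapi ctrl target disk" >&2
               return 1 ;;
    esac
}

# Print (do not run) the commands from steps 3 and 4 so they can be reviewed.
show_restore_commands() {
    dev=$(raw_disk_dev "$@") || return 1
    echo "prtvtoc $dev"
    echo "/sbin/fdisk -b /usr/lib/fs/ufs/mboot $dev"
}

# Example:
#   show_restore_commands scsi 0 0 0
```

Only type the printed fdisk command once the prtvtoc output matches the Solaris disk you expect.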


(7.3) What can I do during the Solaris/x86 booting sequence?

Starting with Solaris 10 1/06, Solaris uses GRUB to boot, which makes the answer below obsolete for those releases. With GRUB you can boot multiple partitions, with multiple instances of Solaris and other operating systems (such as Linux and Windows).

Step #1: Boot loader

If you have multiple partitions, the boot loader in the Solaris partition will come up and ask you which partition you want to boot. This partition must be the active partition, or at least be marked active by a third-party boot manager before this boot loader receives control (not all boot managers have this feature). If you don't answer in so many seconds, it boots Solaris.

This boot manager is pretty basic. It has no customization. You can't change the default boot partition to one other than Solaris, you can't change the timeout value, and you can't change the partition descriptions. But it gets the job done.

Step #2: Device Configuration Assistant (DCA)

This will ask you to press ESC if you want to change stuff. This is to make up for the fact that x86 machines don't have a nice OpenBOOT chip to sort out REAL "Plug and Play".

Basically, in Solaris x86, the Device Assistant seems to set up certain things in /boot/solaris. This is so the "real" OS has some common format to examine for devices, instead of having lots of nasty x86 hardware-specific stuff. That way, Sun can keep the main OS somewhat hardware independent, and keep it very close to the SPARC version.

The "Assistant" can actually be of assistance. If you select "partial scan", then "Device tasks", and then "View/Edit Devices", it will tell you what Solaris THINKS your devices are, and where they are. Quite useful when Solaris gets completely lost and you're wondering whether it's your fault.

Otherwise, it can give you a warm fuzzy feeling, if you select "Full Scan", and you see all your devices properly recognized.

Step #3: OS Boot

Well, actually, the "Boot Assistant". The interface is similar, but not identical, to SPARC Solaris' OpenBoot 'boot' command. The main differences I notice are:

Step #4: The Main OS: Solaris

You made it (I hope)! You should now see a line with "SunOS5.8" or similar in it, and a little twirly text character spinner starting. You are now really in the classic Solaris environment. From here on in, your experience is almost identical to that of your brethren who work with SPARC Sun equipment.

To learn more about the Solaris boot process, read the boot(1M) man page.

[Thanks to Phil at http://www.bolthole.com/solaris/]


(7.4) How do I logon as root if the password doesn't work anymore?

Regaining control of a Solaris x86 system where the root password has been lost can be accomplished by the following steps. Note that any savvy user can do this with the proper CD-ROM and diskette. Therefore, of course, physical security of a system is important for machines containing sensitive data.

  1. Insert installation boot diskette and installation CD-ROM for Solaris x86.
  2. Boot system from the installation floppy and select the CD-ROM as the boot device.
  3. Type "b -s" (instead of typing 1 or 2 from the menu) and it'll drop you straight to a root shell, #, (and you'll be in single-user mode).
  4. At the root prompt, #, key in the following commands, which will create a directory called hdrive under the /tmp directory and then mount the root hard drive partition under this temporary directory.
          mkdir  /tmp/hdrive
          mount  /dev/dsk/c0t0d0s0  /tmp/hdrive #SCSI; for ATAPI, omit "t0"
    
  5. To use the vi editor, the TERM variable must be defined. Key in the following commands.
          TERM=at386
          export TERM
    
  6. Start vi (or some other editor) and load /tmp/hdrive/etc/shadow file:
          vi /tmp/hdrive/etc/shadow
    
  7. Change the first line of the shadow file that has the root entry to:
          root::6445::::::
    
  8. Write and quit the vi editor with the "!" override command:
          :wq!
    
  9. Remove the floppy installation diskette, and reboot the system:
          /sbin/shutdown -i6
    
  10. When system has rebooted from the hard drive, you can now log in from the Console Login: as root with no password. Just hit enter for the password.
  11. After logging in as root, use the passwd command to change the root password and secure the system.
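
If you would rather not fight with vi on the install media, steps 6 through 8 can be sketched as a small shell function. This is only an illustration under a few assumptions: clear_root_password is a hypothetical name, awk must be available in the shell you booted into, and unlike step 7 it keeps whatever values the other fields of the root entry already have instead of typing in a fixed line.

```shell
# Hypothetical helper: blank the password field (field 2) of the root entry
# in a shadow file, leaving every other field and every other line untouched.
# Usage: clear_root_password /tmp/hdrive/etc/shadow
clear_root_password() {
    shadow=$1
    cp -p "$shadow" "$shadow.orig" || return 1   # keep a copy of the original
    # Fields are colon-separated; only the line whose first field is "root"
    # is rebuilt, with an empty second field.
    awk -F: 'BEGIN { OFS = ":" } $1 == "root" { $2 = "" } { print }' \
        "$shadow.orig" > "$shadow"
}
```

Remember to remove the .orig copy once you have set a new root password, since it still contains the old password hash.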

Andreas Pfaffeneder has a simpler suggestion for recovering the password:
Choose the Failsafe-Boot option (which results in kernel/unix -s), and answer "Yes" when you are asked whether / of the installed system should be mounted. Chroot into the system and change the password:

# chroot /a /bin/bash
# passwd
# /sbin/shutdown -i6

[Thanks to Lynn R. Francis of Texas State Technical College and Andreas Pfaffeneder]


(7.5) My licensed software fails because the host ID is 0. What's wrong?

Intel processor machines don't have an IDPROM, so Sun generates a serial number pseudo-randomly during installation; this is the value returned by the hostid command and by sysinfo()'s SI_HW_SERIAL. The number is stored in /kernel/misc/sysinit, whose only function, it appears, is to provide the serial number. If the serialization information is tampered with or sysinit fails to load, the host ID will be 0. If you reinstall Solaris, sysinit will be regenerated and your host ID will change. So be careful about reinstalling Solaris if you have licensed software that depends on your host ID, and back up your sysinit file.

To preserve the same ID (and therefore licenses), copy file /kernel/misc/sysinit to the replacement system. I understand the Sun Workshop/Sun ONE compiler manual says this is allowed twice per calendar year (please verify this yourself).
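
As a rough sketch of the two chores mentioned above (noticing a zero host ID and keeping a copy of sysinit), something like the following could be used. The function names and the default backup location /var/tmp are my own assumptions; only the sysinit path comes from the answer above.

```shell
# Hypothetical helper: succeed if the given host ID string is all zeros.
hostid_is_zero() {
    case "$1" in
        ""|*[!0]*) return 1 ;;   # empty, or contains something other than "0"
        *)         return 0 ;;
    esac
}

# Hypothetical helper: copy sysinit to a backup directory, preserving
# mode and timestamps. Usage: backup_sysinit [source] [backup-dir]
backup_sysinit() {
    src=${1:-/kernel/misc/sysinit}
    dest=${2:-/var/tmp}
    cp -p "$src" "$dest/sysinit.backup"
}

# Example:
#   if hostid_is_zero "$(hostid)"; then echo "host ID is 0: check sysinit"; fi
#   backup_sysinit
```

Keep the backup somewhere that survives a reinstall (another machine or removable media), not only on the disk you are about to wipe.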

For more information, see the Sun NVRAM/hostid FAQ, available at http://www.squirrel.com/squirrel/sun-nvram-hostid.faq.html and elsewhere. This also has tools to fake hostids.


(7.6) How can I fix Netscape Communicator to render fonts correctly on S/x86?

This problem occurs with Solaris 2.6 and Netscape Communicator 4.0x, and has since been fixed. Apply patch 106248, which I'm told fixes this problem. A workaround is to add the following two lines to your ~/.xinitrc file:

       xset +fp /usr/openwin/lib/X11/fonts/75dpi/
       xset fp rehash

Another workaround, if you don't have these fonts, is to go into Netscape Preferences and change the font faces.

[Thanks to Alan Orndorff, Jeffrey Cook, and John Riddoch]


(7.7) Why doesn't Netscape run as root?

This is due to a Netscape 4.x bug: it trashes the $HOME environment variable, so the X11 library cannot find root's .Xauthority file in the root directory unless your current directory is /.

Large, complex programs (especially those that exchange data with the Internet) should not be run as root. Experienced users and administrators run as root only for essential sysadmin tasks.

If you must run as root, try one of these tricks:

[Thanks to Jürgen Keil via John Groenveld]


(7.8) I moved my PCI host adapter to another slot and the system won't boot!

Don't move the adapter. It isn't a supported feature in Solaris and isn't easy to recover from. If you have any choice in the matter, move the controller back to its original slot.

The PCI device number is part of the device's basic ID, including its child disks. If you change slots, you've effectively removed that controller and its disks, and added an unrelated controller and disks. You need to fix up all of the references to the old disks to point to the new disks.

I've never come up with any strategy better than "boot, observe failure, fix failure, reboot" for recovering from this kind of change. For simple cases (single controller, in particular) it can be helpful to clear /dev/dsk/* and /dev/rdsk/* and run "disks", but that is perilous too.

Incidentally, changing motherboards is likely to trip exactly this problem, because motherboards generally number their slots differently.

To conclude, it's difficult and dangerous, and the general guidelines involve fixing:

  1. /etc/vfstab or /dev or both
  2. /devices to match one another
  3. possibly removing lines from /etc/path_to_inst in order to make the right /devices nodes show up

The ultimate goal is to get back the same controller numbers as before.
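
For the "observe failure" part, one thing that can be checked mechanically is whether the disk names in /etc/vfstab still exist under /dev. Here is a minimal sketch, assuming you have booted from CD and mounted the damaged root somewhere such as /mnt; missing_vfstab_devices is a hypothetical helper name, and it only reports problems, it does not fix anything.

```shell
# Hypothetical helper: list "device to mount" entries in a vfstab file that
# point at /dev/dsk/... names which do not exist under the given root.
# Usage: missing_vfstab_devices /mnt/etc/vfstab /mnt
missing_vfstab_devices() {
    vfstab=$1
    root=${2:-}
    # Field 1 is the device to mount; comments and non-disk entries
    # (such as /proc or swap files) do not match and are skipped.
    awk '$1 ~ /^\/dev\/dsk\// { print $1 }' "$vfstab" |
    while read -r dev; do
        # -e follows the symlink, so a dangling /dev link counts as missing.
        [ -e "$root$dev" ] || echo "$dev"
    done
}
```

Any name it prints is a candidate for the controller number that changed when the adapter moved.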

[Sun FAQ 2576-02 at http://access1.Sun.COM/cgi-bin/rinfo2html?257602.faq]


(7.9) Why is Solaris always booting into the Device Configuration Assistant (DCA)?

This is usually caused by one of the following:

To change or set your default boot device, see Sun FAQ 2271-02 at http://access1.Sun.COM/cgi-bin/rinfo2html?227102.faq for instructions. To summarize:


(7.10) What is the equivalent of STOP-A for Solaris x86?

>I don't think so, because Stop-A allow you to go into open boot prom of
>the SUN and on a x86 it's a different thing (BIOS)

Unlike Solaris on SPARC (where STOP-A gets you the OpenBoot prompt), there is no PROM firmware to drop into on x86. You can boot your system under kadb and then use a similar keystroke to drop into kadb and obtain debugging information. To boot under kadb, type eeprom boot-file=kadb and then:

You can then type, for example:
$<systemdump
to force your system to panic and generate a crash dump (the equivalent of "sync" at the ok prompt on SPARC).

The Device Configuration Assistant (DCA) portion of the Intel boot process can be interrupted by hitting escape (when prompted). This (I feel) is the Intel version of the Boot Prom Monitor. Of course, all the commands cannot be equated apples to apples because of the hardware differences!

If your console is a terminal, you can type "shift-break" or "ctrl-break" or "ctrl-\" (ctrl-backslash) or "<enter>" followed by "~" and "ctrl-break" on Solaris SPARC, but this, too, is not available for Solaris x86.

With Solaris 8 SPARC (but not Intel), there's a new feature to allow keyboard sequences to generate a break (bug 4147705). The 3-character sequence is <RETURN>, ~ (tilde), ^b. Each character must be entered 0.5 to 2 seconds after the previous one. This is enabled with the "kbd -a alternate" command.

Similarly, a soft reset is <RETURN>, ~ (tilde), cntl-shift-R; XIR is <RETURN>, ~ (tilde), cntl-shift-X; and Power Cycle is <RETURN>, ~ (tilde), cntl-shift-P. I believe these commands are also available only on SPARC.

[Thanks to Ramit Luthra and Mike Shapiro]


(7.11) How can I reboot Solaris x86 without it asking me to "press a key" before rebooting?

This works for me: become root and type "shutdown -i6 -g0 -y". Or: "/sbin/shutdown -i6". This is most useful when the system is remote with no console keyboard access. Don't use reboot, halt, or poweroff, as they bypass the shutdown scripts.

[Thanks to Charles J. Fisher]


(7.12) Help! I'm stuck in the "Boot Assistant" and can't boot. What do I do?

If you get a message similar to:
Run Error: File not found. could not run s
you probably typed "reboot -- -s" or "reboot -- -r" or similar. This works for Solaris SPARC, but not for Solaris on Intel, where it's disastrous. It changes your "boot-file" eeprom variable to "-s", which errors out and puts you in an endless loop in the Boot Assistant.

To undo this, type the following at the Boot Assistant prompt: "b kernel/unix". This boots with the file /platform/i86pc/kernel/unix. If this doesn't help, your filesystem may be hosed. In that case, you have to reinstall. But make sure this is the case first.


(7.13) Help! I get error 2 or error 8 while applying patches. What do I do?

Don't do anything. Error 2 means you already have the same or newer code. Error 8 means you can't patch some optional packages that haven't been installed, even if you did "everything plus OEM" during the original installation. Other errors, usually from lack of disk space, are explained in the patchadd(1M) man page.
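
If you apply patches from a script, the same rule of thumb can be written down once. This is only a sketch, assuming the error numbers above are the exit status of patchadd; patch_result is a hypothetical helper name and the wording of its messages is mine.

```shell
# Hypothetical helper: classify a patchadd exit status per the answer above.
# Returns 0 for the harmless cases (0, 2, 8) and 1 for anything else.
patch_result() {
    case "$1" in
        0) echo "applied" ;;
        2) echo "skipped: same or newer code already installed" ;;
        8) echo "skipped: patched package is not installed" ;;
        *) echo "failed: exit status $1, see patchadd(1M)"
           return 1 ;;
    esac
}

# Example (PATCH-DIR is a placeholder for the unpacked patch directory):
#   patchadd PATCH-DIR; patch_result $?
```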

[Thanks to Paul Karagianis]


(7.14) How do I prevent kdmconfig from running on boot up when I know my keyboard, display, and mouse configuration has not changed?

Mike of Sun has this response (9/2002):

I recognized this as a bug that was fixed a while back, for one instance with older ATI cards. I mentioned it to the video developer that fixed the ATI bug and he mentioned that there is a workaround if you see this:

This problem occurs with certain hardware (keyboards, mice, video devices). During booting, a checksum is calculated based on some info obtained for each device. The checksum is compared to a checksum recorded in the OWconfig file. If the checksums don't match, kdmconfig thinks the device may have changed, and asks the user to check it.

On systems that exhibit this problem, the device info that is checksummed seems to change from boot to boot even though no hardware has changed. I've seen this happen with some old ATI video cards and some keyboards.

The easy workaround for the problem is to run kdmconfig, then test and accept the desired configuration by clicking on the "Yes" button of the test display. Then edit the last line of the OWconfig file in /etc/openwin/server/etc. Change the "1" to "2", so that it says: TestedByUser="2"; This will cause kdmconfig to ignore checksum differences.
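
The OWconfig edit can also be done non-interactively. A minimal sketch, assuming the line reads TestedByUser="1" exactly as described above; mark_tested_by_user is a hypothetical helper name, and it keeps a copy of the original file next to it in case the result isn't what you wanted.

```shell
# Hypothetical helper: switch TestedByUser from "1" to "2" in an OWconfig
# file, keeping a copy of the original alongside it.
# Usage: mark_tested_by_user /etc/openwin/server/etc/OWconfig
mark_tested_by_user() {
    cfg=$1
    cp -p "$cfg" "$cfg.orig" || return 1
    # Only the TestedByUser="1" setting is rewritten; all other lines pass through.
    sed 's/TestedByUser="1"/TestedByUser="2"/' "$cfg.orig" > "$cfg"
}
```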

If you are upgrading from Solaris 8 or older to Solaris 9, check the ddxHandler line. It should say "ddxSUNWx86mouse.so.1", not "ddxSUNWmouse.so.1". Otherwise, X Windows won't start (no graphics).

[Thanks to Mike Riley]


(7.15) * I get this error message: "can't get local host's domain name" or "The local host's domain name hasn't been set." What do I do?

This is a NIS message. The easiest way to fix it is to type the following as root:

Solaris 11:
svccfg -s nis/domain setprop \
    config/domainname = hostname: abc.com
svccfg -s nis/domain:default refresh
svcadm enable nis/domain

Solaris 10:
domainname abc.com; domainname >/etc/defaultdomain

(replace abc.com with your NIS domain name, which is sometimes the same as the DNS domain name).


(7.16) My system doesn't boot due to superblock problems with the root filesystem. What do I do?

Normally, you reboot in single-user mode, run /usr/sbin/fsck as root, and everything is OK. If you get a message about errors/problems on /dev/dsk/c0d0s0, are told to run fsck manually in single-user mode, and get this message:
BAD SUPER BLOCK: BAD VALUES IN SUPERBLOCK
USE AN ALTERNATIVE SUPERBLOCK to SUPPLY NEEDED INFORMATION
e.g. fsck -F ufs -b=# [special].
then you may be able to recover if the disk isn't entirely corrupted. The superblock stores important information about the file system. Because it is so important, it is duplicated in several places. Hopefully one of the backup superblocks isn't corrupted. To see the locations of the duplicate superblocks, use newfs -Nv. For example, if your root slice is at /dev/dsk/c0d0s0, run this command:

# newfs -Nv /dev/dsk/c0d0s0
The -N option is essential: it prints the file system parameters without creating anything, so you don't clobber your root slice with a new filesystem (-v just makes the output verbose). Your output should look like this:

# newfs -Nv /dev/dsk/c0d0s0
mkfs -F ufs -o N /dev/rdsk/c0d0s0 614880 63 16 8192 1024 16 10 60 2048 t 0 -1 8 7 n
/dev/rdsk/c0d0s0:       614880 sectors in 610 cylinders of 16 tracks, 63 sectors
        300.2MB in 39 cyl groups (16 c/g, 7.88MB/g, 3776 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 16224, 32416, 48608, 64800, 80992, 97184, 113376, 129568, 145760,
 468576, 484768, 500960, 516128, 532320, 548512, 564704, 580896, 597088,
 613280,
Note the numbers following "super-block backups." Pick one of them (e.g., 32) and pass it to fsck with the -o b= option, naming the raw device of the damaged slice:
# fsck -F ufs -o b=32 /dev/rdsk/c0d0s0
You may get the message FREE BLK COUNT(S) WRONG IN SUPERBLOCK SALVAGE? or FILE SYSTEM STATE IN SUPERBLOCK IS WRONG; FIX? In either case, type "yes" and press return. You should get a FILE SYSTEM WAS MODIFIED message. Reboot your system. If the system complains about shutdown not being found, do a halt -q. Now, hopefully, your system will boot up without any problems.
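
If the first backup superblock turns out to be bad too, you will want the whole list in a form you can step through. The following sketch pulls the numbers out of the newfs -Nv output; list_backup_superblocks is a hypothetical helper name, and it assumes the output has the "super-block backups" heading shown above, followed by comma-separated numbers.

```shell
# Hypothetical helper: read "newfs -Nv" output on standard input and print
# the backup superblock numbers, one per line.
list_backup_superblocks() {
    awk '
        # On lines after the heading, turn commas into spaces and print
        # every field that is purely numeric.
        found {
            gsub(/,/, " ")
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) print $i
        }
        /super-block backups/ { found = 1 }
    '
}

# Example:
#   newfs -Nv /dev/dsk/c0d0s0 | list_backup_superblocks
# then try the numbers one at a time, for instance:
#   fsck -F ufs -o b=16224 /dev/rdsk/c0d0s0
```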

[Thanks to Kevin Smith.]


(7.17) My system doesn't boot because the boot archive is corrupt. What do I do?

The boot archive contains the kernel modules and configuration files that are required to boot your machine. Boot up in Solaris failsafe mode. Your Solaris image should be mounted on /a. Type the following:
rm -f /a/platform/i86pc/boot_archive; bootadm update-archive -R /a

[Thanks to Pradhap Devarajan.]


http://sun.drydog.com/faq/




This web page is not associated with Oracle Corporation.

 

If you have questions or comments, please send a message to Dan Anderson.

http://www.sun.drydog.com/faq/7.html