Greyhole troubleshooting

From Amahi Wiki

NOTE: On Fedora, all Greyhole commands must be executed as the root user. On Ubuntu, prefix each command with sudo.

Gathering Good Troubleshooting Information

When reporting an issue with Greyhole, or asking for help, you should provide the following:

1. Which versions of the OS, Samba & Greyhole are you running?

Fedora:
uname -r; rpm -q samba amahi-greyhole
Ubuntu:
uname -r; dpkg -s samba greyhole

2. The content of the /etc/samba/smb.conf & /etc/greyhole.conf files (provide paste URLs):

Fedora:
yum -y install fpaste
fpaste /etc/samba/smb.conf 
fpaste /etc/greyhole.conf
Ubuntu:
pastebinit /etc/samba/smb.conf 
pastebinit /etc/greyhole.conf

3. The result of the following commands:

mount
fdisk -l
df -h
greyhole --stats

4. The list of drives in your storage pool (per Amahi platform):

mysql -u root -phda -e "select * from disk_pool_partitions" hda_production

5. A list of the directories on the root of the drives included in your storage pool, obtained with the following command (provide a paste URL):

Fedora
mysql -u root -phda -e "select concat(path, '/gh') from disk_pool_partitions" \
hda_production | grep -v 'concat(' | xargs ls -la | fpaste
Ubuntu
mysql -u root -phda -e "select concat(path, '/gh') from disk_pool_partitions" \
hda_production | grep -v 'concat(' | xargs ls -la | pastebinit 

6. The Greyhole work queue:

greyhole --view-queue

If the above command returns nothing, execute the following to see the problem:

tail /var/log/greyhole.log

7. If you have issues with a particular file (file disappeared, is wrong size, has wrong filename), execute the following to find out what Greyhole did with this file:

greyhole --debug filename
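
If you would rather paste one report than several, the outputs from steps 3 and 6 can be gathered into a single file. The following is only a sketch: the function names run_section and collect_report and the default path /tmp/greyhole-report.txt are made up for this example, and the commands themselves are the ones listed above.

```shell
#!/bin/sh
# Run one command and print its output under a header line.
# A failing or missing command is noted instead of aborting the report.
run_section() {
    printf '===== %s =====\n' "$*"
    "$@" 2>&1 || printf '(command failed or not found: %s)\n' "$1"
    printf '\n'
}

# Collect the outputs requested in steps 3 and 6 into one file.
# The default report path is an assumption; pass another path as $1.
collect_report() {
    report=${1:-/tmp/greyhole-report.txt}
    {
        run_section uname -r
        run_section mount
        run_section fdisk -l
        run_section df -h
        run_section greyhole --stats
        run_section greyhole --view-queue
        run_section tail /var/log/greyhole.log
    } > "$report"
    printf '%s\n' "$report"
}
```

Run collect_report as root (or with sudo on Ubuntu), then hand the resulting file to fpaste or pastebinit as in step 2.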

Greyhole Work Queue

I've not seen too many other reports of this occurring on the internet, so I suspect it's something that may be specific to my setup. However, the fix is pretty general, so I'm documenting it here for others to use.

Sometimes, Greyhole will have too many tasks queued, and will lock up, leaving you with a full landing zone and an fsck queue that just isn't processing. Always leave your server overnight (or longer, depending on the size of your landing zone; overnight should be sufficient for most) to ensure that it's not just taking a while to process. If it's really, definitely stuck, you may want to follow these instructions to clear the queue.
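
One rough way to tell "slow" from "stuck" is to count the rows in the tasks table before and after the wait. This is a sketch, not an official Greyhole check: it assumes the greyhole/greyhole database credentials used further down this page, and the function names and the "possibly-stuck" label are invented for the example. A queue can also grow simply because new files are arriving, so treat the result as a hint only.

```shell
#!/bin/sh
# Print the number of queued tasks (rows in the 'tasks' table).
# -N drops the column header, -B gives plain batch output.
count_tasks() {
    mysql -u greyhole -pgreyhole -N -B -e 'SELECT COUNT(*) FROM tasks' greyhole
}

# Compare two counts taken some time apart.
# Prints "possibly-stuck" if the queue is non-empty and did not shrink,
# otherwise "progressing".
queue_state() {
    before=$1
    after=$2
    if [ "$after" -gt 0 ] && [ "$after" -ge "$before" ]; then
        echo possibly-stuck
    else
        echo progressing
    fi
}
```

For example, store the output of count_tasks in a variable in the evening, store it again in the morning, and pass both numbers to queue_state.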

Stop the greyhole daemon

Fedora 19
systemctl stop amahi-greyhole.service
Ubuntu
service greyhole stop

Connect to the MySQL engine

mysql -u greyhole -pgreyhole

Select the Greyhole database

USE greyhole;

Empty the 'tasks' table (TRUNCATE drops and recreates it with the same structure)

TRUNCATE TABLE tasks;

Exit the MySQL engine

exit;

Restart the greyhole daemon

Fedora 19
systemctl start amahi-greyhole.service
Ubuntu
service greyhole start

Schedule an fsck to clear out the files currently in the landing zone

greyhole --fsck
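
The steps above can be strung together in a small script. This is a minimal sketch under some assumptions: the helper names (run, greyhole_service, clear_queue) and the DRY_RUN switch are invented for this example, and the distro is passed in by hand as "fedora" or "ubuntu" rather than detected. By default it only prints the commands it would run.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Stop or start the daemon with the command this page gives per distro.
greyhole_service() {
    action=$1
    distro=$2
    case $distro in
        fedora) run systemctl "$action" amahi-greyhole.service ;;
        ubuntu) run service greyhole "$action" ;;
        *) echo "unknown distro: $distro" >&2; return 1 ;;
    esac
}

# Stop the daemon, empty the tasks table, restart, then schedule an fsck.
clear_queue() {
    distro=$1
    greyhole_service stop "$distro" || return 1
    run mysql -u greyhole -pgreyhole -e 'TRUNCATE TABLE tasks' greyhole || return 1
    greyhole_service start "$distro" || return 1
    run greyhole --fsck
}
```

Call clear_queue fedora (or clear_queue ubuntu) first to review the printed commands, then set DRY_RUN=0 at the top of the script and run it again as root to execute them.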

Queue Still Stuck

If you've done the above, and all you've accomplished is another stuck fsck, it's time to get serious. The only solution I've found is to blow away the landing zone and then schedule the fsck with an empty landing zone. Greyhole will rebuild your file links from the files in your storage drives, so you won't lose anything except the data that was waiting to be moved to your storage drives. The steps below will mitigate even that loss.

Copy the contents of your landing zone somewhere else (preferably a non-Greyhole share, so you can sort it out from the comfort and convenience of your normal workstation). In the example below, I just borrow the AFP share I have locally mounted for Apple Time Machine backups, but this can be any folder. Don't copy the /drives subfolder, as that's actually the physical drives attached to the system.

cp -rv /var/hda/files/shareName /mnt/timemachine/landingzonebackup/shareName
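
If you have several shares, a loop saves repeating the copy for each one. The sketch below assumes every folder directly under the landing zone other than drives is a share; the function names list_shares and backup_shares are made up for this example, and the backup folder should not exist yet (otherwise cp nests the copies one level deeper).

```shell
#!/bin/sh
# Print the names of the folders directly under the landing zone,
# skipping 'drives', which holds the physical drives.
list_shares() {
    root=$1
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        [ "$name" = drives ] && continue
        printf '%s\n' "$name"
    done
}

# Copy each share into the backup folder, one folder per share.
backup_shares() {
    root=$1
    dest=$2
    mkdir -p "$dest"
    list_shares "$root" | while IFS= read -r share; do
        cp -r "$root/$share" "$dest/$share"
    done
}
```

With the paths from the example above, this would be called as backup_shares /var/hda/files /mnt/timemachine/landingzonebackup.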

Now, stop the greyhole service again. Do this from the Amahi Console->Settings->Servers->Greyhole. Disable the watchdog and press the big red 'stop' button.

Blow away the contents of your landing zone shares. It is very important you don't touch the /drives folder within /var/hda/files if you have a default setup.

rm -rf /var/hda/files/shareName/*

Now, clean up the horrible mess you'll probably have in each drive in /var/hda/files/drives - I had old shares and other assorted crap that was slowing down each fsck. Your /var/hda/files/drives/driveX/gh folder (replace X with whatever drive number you have) should only have references for your current shares. Use the code below to remove any folders that contain old shares.

rm -rf /var/hda/files/drives/driveX/gh/folderName

Then clean up the even more horrible mess you probably have in your /var/hda/files/drives/driveX/gh/.gh_graveyard folder. Note that this is a hidden folder, so it won't be listed if you just ls the /var/hda/files/drives/driveX/gh folder (use ls -a to see it). Use the command below to remove each folder that isn't a current share.

rm -rf /var/hda/files/drives/driveX/gh/.gh_graveyard/folderName
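
Before deleting anything, it may help to list which folders would qualify. The sketch below only prints candidates and removes nothing; the function name stale_folders is invented for this example, and you supply the names of your current shares yourself. Hidden entries are skipped, so the .gh_graveyard folder has to be checked with a second call.

```shell
#!/bin/sh
# Print every non-hidden folder inside the given directory whose name is
# not in the list of current shares passed as the remaining arguments.
stale_folders() {
    gh_dir=$1
    shift
    for path in "$gh_dir"/*; do
        [ -d "$path" ] || continue
        name=$(basename "$path")
        keep=no
        for share in "$@"; do
            [ "$name" = "$share" ] && keep=yes
        done
        [ "$keep" = yes ] || printf '%s\n' "$path"
    done
}
```

For a drive mounted as drive1 with current shares Movies and Docs (example names), run stale_folders /var/hda/files/drives/drive1/gh Movies Docs, then the same again with /var/hda/files/drives/drive1/gh/.gh_graveyard as the first argument. Review the list by eye before passing anything to rm -rf.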

Now, restart the greyhole service using the Amahi Console. Remember to turn the watchdog back on.

Schedule another --fsck. It should start pretty much right away, and begin rebuilding the landing zone. Note that any shares you have won't be available again until the fsck completes.

greyhole --fsck

Optional: You'll probably now want to check the landing zone backup you made, and recopy any actual data in the folder back in (don't just copy the whole folder back in, that would make everything you just did pointless).

Optional: Cleaning all the crap out of your physical drives has probably freed up a bit of space (or quite a bit, depending on how much crap you had). Now is probably a good time to balance out your storage again.

greyhole --balance

If you run into any trouble with the above, post it in the Amahi forums and PM me (U=doczombie) the link to your forum post if you like (PMs without a link to a forum post are liable to be ignored).

Database Table Error

Can't describe tasks with query: DESCRIBE tasks - Error: Table 'greyhole.tasks' doesn't exist

If the Greyhole service will not start, or you see this message when gathering info for support, it is possible that the Greyhole database was not created.

The following commands will create the database and load the Greyhole schema into it:

hda-create-db-and-user greyhole
mysql -ugreyhole -pgreyhole greyhole < /usr/share/greyhole/schema-mysql.sql

Restart the greyhole daemon

Fedora 19
systemctl start amahi-greyhole.service
Ubuntu
service greyhole start
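
To check whether the table is still missing after these steps, you can ask MySQL to describe it and look for the error text quoted above. This is a sketch only: the function names are made up, the credentials are the greyhole/greyhole pair used on this page, and other failures (wrong password, server not running) produce different messages that this check does not recognise.

```shell
#!/bin/sh
# Succeeds if the text on stdin contains the "table doesn't exist" error.
tasks_table_missing() {
    grep -q "Table 'greyhole.tasks' doesn't exist"
}

# Ask MySQL to describe the tasks table, capturing any error text too.
describe_tasks() {
    mysql -u greyhole -pgreyhole -e 'DESCRIBE tasks' greyhole 2>&1
}
```

Running describe_tasks | tasks_table_missing && echo "schema still missing" prints the message only when the table is absent.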

Drive is OFFLINE

Check that the /gh folder is created on the drive.
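
With several drives in the pool, a loop can do this check for you. A minimal sketch, where the function name missing_gh_dirs is invented for this example and the drive paths are whatever your pool uses:

```shell
#!/bin/sh
# Print each given pool path that has no 'gh' folder at its root.
missing_gh_dirs() {
    for mount_point in "$@"; do
        [ -d "$mount_point/gh" ] || printf '%s\n' "$mount_point"
    done
}
```

For example, missing_gh_dirs /var/hda/files/drives/drive1 /var/hda/files/drives/drive2 prints only the drives where the folder is absent. The paths can be taken from the disk_pool_partitions query shown earlier on this page.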

Out of Inodes

If you run out of inodes due to the size of the Greyhole spool directory, the following should fix the problem:

mv /var/spool/greyhole /var/spool/greyhole.bak
mkdir -p /var/spool/greyhole
chmod 777 /var/spool/greyhole
/usr/bin/greyhole --create-mem-spool
systemctl restart amahi-greyhole.service
greyhole -f
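
To confirm that inodes really are the problem (ideally before moving the spool directory), you can compare the filesystem's inode usage with the number of entries in the spool. A small sketch with invented function names; df -i and find -mindepth are assumed to be the GNU versions shipped with Fedora and Ubuntu.

```shell
#!/bin/sh
# Show inode usage for the filesystem that holds the given path.
inode_usage() {
    df -i "$1"
}

# Count the entries (files and folders) below a directory.
count_entries() {
    find "$1" -mindepth 1 | wc -l | tr -d ' '
}
```

Run inode_usage /var/spool/greyhole and count_entries /var/spool/greyhole. An IUse% close to 100% together with a very large entry count points at the spool directory.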

Reference: Greyhole Issue 123