Difference between revisions of "Infrastructure Virtualization Project"

From Amahi Wiki

Revision as of 02:44, 1 April 2015

Work In Progress
This article is currently undergoing major expansion or restructuring. You are welcome to assist by editing it as well. If this article has not been edited in several days, please remove this template.


Objective

This is a project to update and modernize the infrastructure that keeps the Amahi web sites and services running.

The idea is to provide easier and more sustainable management of the infrastructure to leave more time for the team to devote to moving the project forward.

NOTE: this project is not about running Amahi platform software on virtual servers, etc. For that there is a separate page on Virtualization.

Goals

We have multiple goals:

  • run some of our internal build machines in a reliable, efficient way, so that we have consistent and up-to-date builds/releases
  • have consistent and recent backups making things recoverable
  • run some testing of Amahi apps more easily and efficiently
  • test new features in an isolated manner

...

Known Issues

  • Controller node is memory intensive
  • Image resizing does not work
  • Volume resizing does not work
  • Suspending instance with volume attached does not work after reboot

Pending Actions

  • Set up floating IP address range
  • Create automated VM backup routine
  • Create Fedora 19 32- and 64-bit minimal install images
  • Create Amahi 7 Express CD image

Hardware

Dell Rack Server

  • Dual Xeon E5450 3.0 GHz Processors
  • 32GB PC2-5300 RAM (8x4)
  • Two Gigabit Network Interfaces
  • KVM Network Interface
  • RAID Controller
  • Four Quick Swap Drive Bays
    • 1 - 1 TB (OS and Backup)
    • 2 - 120GB SSD (VMs)
    • 3 - Empty
    • 4 - Empty

Software

  • CentOS 7 x86_64 (Minimal)
  • OpenStack Juno Release

Setup

  • Download and install CentOS 7 x86_64 minimal image
  • Configure FQDN (/etc/hosts and /etc/hostname)
  • Manually configure networking (set static IP address)
  • Add users and private keys for SSH login
  • Disable SSH password and root login
  • Enable EPEL Repo
yum install epel-release
or
rpm -Uvh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-2.noarch.rpm
  • Perform OS update
yum -y update
  • Install OpenStack following RDO Quickstart instructions (run packstack --allinone as root)
  • Configure network bridging (refer to RDO Reference)
    • Set CONFIG_PROVISION_ALL_IN_ONE_OVS_BRIDGE=n in packstack-answers-20141028-205455.txt
    • Executed packstack --answer-file=packstack-answers-20141028-205455.txt (as root)
    • Created /etc/sysconfig/network-scripts/ifcfg-br-ex
    • Revised /etc/sysconfig/network-scripts/ifcfg-enp9s0f0
    • Appended lines to /etc/neutron/plugin.ini
    • Restarted the network service
    • Removed router and subnet (ALL instances must be terminated to remove subnet)
    • Recreated subnet with IP address allocation range (set floating IP addresses)
    • Recreated router to match gateway
  • Configure DNS to access internet
    • Edit /etc/neutron/dhcp_agent.ini and uncomment the line below:
# Comma-separated list of DNS servers which will be used by dnsmasq
# as forwarders.
# dnsmasq_dns_servers = 
and add 8.8.8.8,8.8.4.4 after the equals sign (=). Then reboot, since it was unclear which services would need to be restarted.
  • Extend cinder-volumes past 20GB to allow for creating additional volumes to attach to instances.
    • Followed the OpenStack Increase Volume Capacity tutorial (substitute any name for cinder-volumes)
    • In /etc/rc.local, after losetup -f /var/lib/cinder/cinder-volumes:
      1. Added && losetup -f /mnt/backup/stack-volumes
      2. Whenever cinder-volumes is extended again, append && losetup -f followed by the location of the new LVM file to the same command sequence.
      3. Run pvs -v to verify, then reboot and recheck.
      4. This change is required to ensure the LVM remains intact after a reboot.
    • Created 50GB of additional space for volumes (/mnt/backup/stack-volumes).
    • Total volume space available is now 70GB.
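
The two file edits described in the Setup list above (the dnsmasq_dns_servers line in /etc/neutron/dhcp_agent.ini and the losetup sequence in /etc/rc.local) can be sketched as small shell helpers. This is only a sketch under assumptions: the function names set_dns_forwarders and append_loop_file are made up for illustration, GNU sed (as shipped with CentOS) is assumed for the -i option, and the rc.local line is assumed to contain exactly the losetup command quoted above.

```shell
#!/bin/sh
# Hypothetical helpers; neither name is part of OpenStack or RDO.

# Uncomment dnsmasq_dns_servers and set the forwarders.
set_dns_forwarders() {
    conf_file="$1"      # e.g. /etc/neutron/dhcp_agent.ini
    servers="$2"        # e.g. 8.8.8.8,8.8.4.4
    # Keep a backup copy before editing in place.
    cp "$conf_file" "$conf_file.bak" || return 1
    # Match the option whether or not it is commented out, and rewrite it.
    sed -i "s|^#* *dnsmasq_dns_servers *=.*|dnsmasq_dns_servers = $servers|" "$conf_file"
}

# Append "&& losetup -f <file>" to the existing cinder-volumes line.
append_loop_file() {
    rc_file="$1"        # e.g. /etc/rc.local
    new_backing="$2"    # e.g. /mnt/backup/stack-volumes
    # Do nothing if the new backing file is already listed.
    grep -qF "losetup -f $new_backing" "$rc_file" && return 0
    cp "$rc_file" "$rc_file.bak" || return 1
    # Only the line that sets up cinder-volumes is extended.
    sed -i "\|losetup -f /var/lib/cinder/cinder-volumes| s|\$| \&\& losetup -f $new_backing|" "$rc_file"
}
```

As root, this might be used as set_dns_forwarders /etc/neutron/dhcp_agent.ini 8.8.8.8,8.8.4.4 and append_loop_file /etc/rc.local /mnt/backup/stack-volumes, followed by the reboot and the pvs -v check described above. Both helpers leave a .bak copy next to the edited file.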

Build Images

This will outline how to build OpenStack images using Proxmox VE.

  • Log into Proxmox VE web UI
  • Create a VM or clone an existing one
    1. If creating a VM, install the OS
    2. If using a clone, start the VM
  • Open a console window for the VM
    1. Log in and as root do the following
      • dd if=/dev/zero of=/mytempfile bs=1M (zero out any unused space)
      • rm -f /mytempfile
    2. Shut down the VM
    3. Log into Proxmox VE via SSH and execute the following from command line
      • Navigate to /var/lib/vz/### (number of VM)
      • mv original_image.qcow2 original_image.qcow2_backup (rename original image)
      • qemu-img convert -O qcow2 original_image.qcow2_backup original_image.qcow2
      • Copy new .qcow2 image to a safe location for uploading into OpenStack
      • Remove the backup file (original_image.qcow2_backup)
      • Delete the VM from Proxmox VE web UI
  • Use WinSCP or a similar program to copy the .qcow2 image to the client machine
  • Upload into OpenStack via the web UI


REF: Reclaim disk space from .qcow2 or .vmdk image
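
The zero-fill and convert steps above can be sketched as a pair of shell functions. This is a minimal sketch, not a tested procedure: the function names, the run helper, and the DRY_RUN variable are assumptions made for illustration, and the image name is the placeholder used in the steps above. With DRY_RUN=1 the commands are only printed, which makes it possible to review them before running anything on the Proxmox VE host.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1, otherwise execute it.
# (Hypothetical helper, used only in this sketch.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Inside the guest, as root: fill the free space with zeros, then delete the file.
# dd stops with an error once the disk is full, which is expected here.
zero_free_space() {
    run dd if=/dev/zero of=/mytempfile bs=1M || true
    run rm -f /mytempfile
}

# On the Proxmox VE host, in the VM's image directory, with the VM shut down:
# rename the original image and write a compacted copy under the old name.
compact_image() {
    image="$1"          # e.g. original_image.qcow2
    run mv "$image" "${image}_backup" || return 1
    run qemu-img convert -O qcow2 "${image}_backup" "$image"
}
```

For example, DRY_RUN=1 followed by compact_image original_image.qcow2 prints the mv and qemu-img commands without touching any file. The _backup file is deliberately left in place so it can be removed by hand once the new image has been copied to a safe location.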

Create Instance

This is a nice, straightforward tutorial on creating an instance.

Notes

  • Floating IPs may not work on IPs that are not externally routed. This may be why the RDO setup creates a 179. "public" network by default. I deleted that network.
  • The external network needs to be "flagged" as external. This cannot be done with the UI, but I am told the Juno release has a feature for attribute editing, so that the external attribute can be set to Yes. Once that is done, MAYBE the system allows floating IPs in that network even if the IP range is not externally routable.
  • Basically, we need to understand what it takes to get an image created and seasoned, and how we need to maintain these over long periods. I think the main workhorse is the qcow2 tools.
  • These images are like "snapshots" in some way, but a snapshot is frozen and cannot be tweaked.
  • Long term we want to make images like this for testing, e.g. an Amahi 7 image that is bootable and is a plain install. Another example is a fully up-to-date Amahi 7 image, etc.
  • So they are alive in the sense that, although these images are frozen in time, one can take a copy and then evolve it into a new version of the image.
  • Refer to "Fix inconsistent OpenStack volumes and instances from Cinder and Nova via the database" for correcting instances in error. NOTE: Use extreme caution, as this could corrupt the database. ALWAYS back up the database before making any changes!

Tips

Volume Issues

When a volume becomes detached and/or shows in error, the state can be reset:

source keystonerc_admin
cinder reset-state volume_id

or use the web UI.
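
As a sketch of the reset described above, the helper below passes the target state explicitly and then shows the volume so the result can be checked. The function name reset_volume_state and the CINDER_CMD override are assumptions made for illustration (the override only exists so the helper can be tried without a live cloud). It assumes keystonerc_admin has already been sourced in the same shell. Note that reset-state only changes the state recorded by Cinder; it does not fix whatever caused the error.

```shell
#!/bin/sh
# CINDER_CMD is a hypothetical override; by default the real cinder client is used.
CINDER_CMD="${CINDER_CMD:-cinder}"

# Reset a volume's recorded state (default: available), then display the volume.
reset_volume_state() {
    volume_id="$1"
    state="${2:-available}"
    if [ -z "$volume_id" ]; then
        echo "usage: reset_volume_state volume_id [state]" >&2
        return 2
    fi
    $CINDER_CMD reset-state --state "$state" "$volume_id" || return 1
    $CINDER_CMD show "$volume_id"
}
```

Setting CINDER_CMD=echo before calling reset_volume_state prints the two cinder invocations instead of running them.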

Update/Reboot/Shutdown Process

  • Shutdown/Disconnect
    • Stop all instances via SSH
    • Detach volumes from instances
    • Verify all volumes are detached and all instances are stopped
    • Perform Update/Reboot
  • Once the system has rebooted:
    • Reattach volumes to instances
    • Start all needed instances
    • Verify all instances are operational
    • For F-19 Repo instance, also execute:
systemctl start redis
systemctl stop iptables
sudo su -
su - username
./start-mirror.sh
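
The stop, detach, reattach, and start steps above could also be driven from the controller with the nova command-line client. This is a sketch of an alternative, not the procedure used above (which stops the instances via SSH), and it has not been tested on this setup. The wrapper names and the NOVA_CMD override are assumptions made for illustration, and keystonerc_admin is assumed to have been sourced first.

```shell
#!/bin/sh
# NOVA_CMD is a hypothetical override; by default the real nova client is used.
NOVA_CMD="${NOVA_CMD:-nova}"

# Before the update/reboot.
stop_instance()  { $NOVA_CMD stop "$1"; }               # $1 = instance name or ID
detach_volume()  { $NOVA_CMD volume-detach "$1" "$2"; } # $1 = instance, $2 = volume ID

# Verification: lists all instances with their current status.
list_instances() { $NOVA_CMD list; }

# After the reboot.
attach_volume()  { $NOVA_CMD volume-attach "$1" "$2"; } # $1 = instance, $2 = volume ID
start_instance() { $NOVA_CMD start "$1"; }

# Suggested order for one instance, checking list_instances between steps because
# the nova commands return before the requested operation has completed:
#   stop_instance NAME; list_instances; detach_volume NAME VOLUME_ID
#   (update/reboot)
#   attach_volume NAME VOLUME_ID; start_instance NAME; list_instances
```

Setting NOVA_CMD=echo prints the nova invocations instead of running them, which is a cheap way to review the sequence beforehand.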