Monday, August 31, 2015

Exception: Could not bind to 0.0.0.0:8080 after trying for 30 seconds

I was deploying Swift on Seagate's Kinetic drives, following the deployment guide on GitHub: https://github.com/swiftstack/kinetic-swift/wiki/Deployment

anfield@football:~$ sudo swift-init start main
[sudo] password for mayur:
Starting proxy-server...(/etc/swift/proxy-server.conf)
Starting container-server...(/etc/swift/container-server.conf)
Starting account-server...(/etc/swift/account-server.conf)
Starting object-server...(/etc/swift/object-server.conf)
Traceback (most recent call last):
  File "/usr/local/bin/swift-proxy-server", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/home/mayur/git/kinetic-swift/swift/bin/swift-proxy-server", line 23, in <module>
    sys.exit(run_wsgi(conf_file, 'proxy-server', **options))
  File "/home/mayur/git/kinetic-swift/swift/swift/common/wsgi.py", line 878, in run_wsgi
    error_msg = strategy.bind_ports()
  File "/home/mayur/git/kinetic-swift/swift/swift/common/wsgi.py", line 480, in bind_ports
    self.sock = get_socket(self.conf)
  File "/home/mayur/git/kinetic-swift/swift/swift/common/wsgi.py", line 201, in get_socket
    bind_addr[0], bind_addr[1], bind_timeout))
Exception: Could not bind to 0.0.0.0:8080 after trying for 30 seconds
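
The wording of the message suggests a retry loop: keep trying to bind until a timeout expires, then give up. Here is a minimal Python sketch of that behaviour. It is an illustration only, not Swift's actual code; the function name bind_with_retry and its parameters are my own assumptions.

```python
import errno
import socket
import time


def bind_with_retry(host, port, bind_timeout=30.0, retry_interval=0.1):
    """Try to bind and listen on host:port until bind_timeout seconds pass.

    Hypothetical helper: it only mimics the behaviour implied by the
    "Could not bind ... after trying for N seconds" message.
    """
    deadline = time.time() + bind_timeout
    while True:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(5)
            return sock
        except OSError as err:
            sock.close()
            # Only "address already in use" is worth retrying.
            if err.errno != errno.EADDRINUSE:
                raise
        if time.time() >= deadline:
            break
        time.sleep(retry_interval)
    raise Exception('Could not bind to %s:%s after trying for %s seconds'
                    % (host, port, bind_timeout))
```

If another process already holds the port for the whole timeout, the loop ends with the same kind of exception as in the traceback above.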

I ran the 'netstat -lntp' command to see which process was using port 8080.

anfield@football:~$ netstat -lntp
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:6002            0.0.0.0:*               LISTEN      -              
tcp        0      0 127.0.1.1:53            0.0.0.0:*               LISTEN      -              
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      -              
tcp        0      0 127.0.0.1:11211         0.0.0.0:*               LISTEN      -              
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN      -              
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      4303/python    
tcp        0      0 0.0.0.0:6001            0.0.0.0:*               LISTEN      -              
tcp6       0      0 ::1:631                 :::*                    LISTEN      -              
anfield@football:~$
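
Picking the right row out of the netstat output by eye is easy here, but it can also be scripted. The sketch below parses text in the format shown above; the function name find_listener is hypothetical, and the sample is a trimmed copy of the output. As the netstat notice says, processes owned by other users show up as "-" unless the command is run as root.

```python
def find_listener(netstat_output, port):
    """Return the PID/Program name column for a listening TCP port, or None."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Listening rows look like:
        # tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 4303/python
        if len(fields) < 7 or not fields[0].startswith('tcp'):
            continue
        if fields[5] != 'LISTEN':
            continue
        # rsplit copes with IPv6 addresses such as ::1:631
        local_port = fields[3].rsplit(':', 1)[1]
        if local_port == str(port):
            return fields[6]
    return None


sample = """\
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      4303/python
tcp6       0      0 ::1:631                 :::*                    LISTEN      -
"""
print(find_listener(sample, 8080))  # prints 4303/python
```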

The output shows that a Python process with PID 4303 is already listening on port 8080.

Here is how I found out what that process was, and how I got rid of it.

anfield@football:~$ ps -ef | grep 4303
mayur     4303  1541  1 14:22 ?        00:00:03 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
mayur     4308  4303  0 14:22 ?        00:00:00 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
mayur     4375  2432  0 14:26 pts/1    00:00:00 grep --color=auto 4303
anfield@football:~$
anfield@football:~$
anfield@football:~$ kill -9 4303
anfield@football:~$
anfield@football:~$ ps -ef | grep 4303
mayur     4378  2432  0 14:26 pts/1    00:00:00 grep --color=auto 4303
anfield@football:~$

The output above shows that the process holding port 8080 was a leftover swift-proxy-server (a Python process). I killed it and then restarted Swift.

The Swift proxy came up fine after that, as shown below.

anfield@football:~$ netstat -lntp
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:6002            0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.1.1:53            0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:11211         0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:6001            0.0.0.0:*               LISTEN      -
tcp6       0      0 ::1:631                 :::*                    LISTEN      -

anfield@football:~$ sudo swift-init start main
Starting proxy-server...(/etc/swift/proxy-server.conf)
container-server running (4322 - /etc/swift/container-server.conf)
container-server already started...
account-server running (4323 - /etc/swift/account-server.conf)
account-server already started...
object-server running (4324 - /etc/swift/object-server.conf)
object-server already started...
anfield@football:~$


Friday, January 16, 2015

Now, are the NICs 1 GbE or 10 GbE?

In my previous post we talked about how to figure out whether the local drives on the x86 box are SSDs or HDDs. This machine also has multiple NICs: 1 GbE and two 10 GbE. I plan on using the 1 GbE for management operations, and would like to use the 10 GbE for outbound (client) and inbound (storage/cluster) traffic.

Using ethtool, we can figure out the speed of the NICs.

kinetic@paco1:~$ ethtool eth0
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Supported pause frame use: Symmetric
    Supports auto-negotiation: Yes
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: on (auto)
Cannot get wake-on-lan settings: Operation not permitted
    Current message level: 0x00000007 (7)
                   drv probe link
    Link detected: yes


kinetic@paco1:~$ ethtool p513p2
Settings for p513p2:
    Supported ports: [ FIBRE ]
    Supported link modes:   10000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: No
    Advertised link modes:  10000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: No
    Speed: 10000Mb/s
    Duplex: Full
    Port: Direct Attach Copper
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: off
Cannot get wake-on-lan settings: Operation not permitted
    Current message level: 0x00000007 (7)
                   drv probe link
    Link detected: yes
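
With more than a couple of interfaces it gets tedious to read the full ethtool output for each one. Here is a small sketch that pulls just the "Speed:" line out of text in the format shown above. The function names are hypothetical, and it returns None when no numeric speed is reported.

```python
import re


def parse_link_speed_mbps(ethtool_output):
    """Return the 'Speed:' value in Mb/s, or None if no number is reported."""
    match = re.search(r'^\s*Speed:\s*(\d+)Mb/s', ethtool_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def classify(speed_mbps):
    """Map a speed in Mb/s to a short label such as '1 GbE' or '10 GbE'."""
    if speed_mbps is None:
        return 'unknown'
    if speed_mbps >= 1000:
        return '%d GbE' % (speed_mbps // 1000)
    return '%d Mb/s' % speed_mbps


sample = """\
Settings for p513p2:
    Supported ports: [ FIBRE ]
    Speed: 10000Mb/s
    Duplex: Full
"""
print(classify(parse_link_speed_mbps(sample)))  # prints 10 GbE
```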

Another useful tool is lspci.

kinetic@paco1:~$ lspci -vv | grep -i ethernet
02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
02:00.2 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
02:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
04:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

Now let's redirect all the output from lspci to a file and have a look at it.

kinetic@paco1:~$ lspci -vv > nic.txt


02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
        Subsystem: Intel Corporation Device 3582
        Physical Slot: 2
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 27
        Region 0: Memory at d0960000 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at 3060 [size=32]
        Region 3: Memory at d09b0000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: igb


04:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
        Subsystem: Intel Corporation Device 3557
        Physical Slot: 2-2
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 32
        Region 0: Memory at d0c20000 (64-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at 2020 [size=32]
        Region 4: Memory at d0c50000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: ixgbe
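
The interesting bits in nic.txt are the controller description and the "Kernel driver in use" line (igb for the 1 GbE ports, ixgbe for the 10 GbE ports). A sketch that extracts just those two fields from lspci -vv text follows; the function name ethernet_drivers is hypothetical, and it assumes the layout shown above, where each device starts on a non-indented line.

```python
def ethernet_drivers(lspci_output):
    """Map PCI address -> (description, kernel driver) for Ethernet controllers."""
    result = {}
    current = None
    for line in lspci_output.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new device record.
            address, _, rest = line.partition(' ')
            if rest.startswith('Ethernet controller:'):
                current = address
                description = rest.split(':', 1)[1].strip()
                result[current] = (description, None)
            else:
                current = None
        elif current and 'Kernel driver in use:' in line:
            driver = line.split(':', 1)[1].strip()
            result[current] = (result[current][0], driver)
    return result
```

Calling it on the contents of nic.txt, for example ethernet_drivers(open('nic.txt').read()), gives one entry per Ethernet port.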

Thursday, January 15, 2015

Are the disks HDDs or SSDs?

I was given an x86 system running Ubuntu that had some local disks, some of which were HDDs while the others were SSDs. My plan is to use the SSDs for the account and container data of the Swift deployment.

kinetic@paco1:~$ uname -a
Linux paco1 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

To find out which disks are HDDs and which are SSDs, do the following.

kinetic@paco1:~$ cat /sys/block/sda/queue/rotational
1
kinetic@paco1:~$ cat /sys/block/sdb/queue/rotational
0
kinetic@paco1:~$ cat /sys/block/sdc/queue/rotational
0
kinetic@paco1:~$ cat /sys/block/sdd/queue/rotational
0

A 1 indicates an HDD (rotational media) and a 0 indicates an SSD.
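
Rather than running cat once per disk, the same check can be looped over every block device. Here is a sketch, assuming the standard /sys/block layout used above. The function name classify_disks is hypothetical, and the flag only reflects what the kernel reports, so virtual or RAID-backed devices may not be labelled accurately.

```python
import glob
import os


def classify_disks(sys_block='/sys/block'):
    """Return {device: 'HDD' or 'SSD'} based on the queue/rotational flag."""
    disks = {}
    pattern = os.path.join(sys_block, '*', 'queue', 'rotational')
    for path in sorted(glob.glob(pattern)):
        # path looks like <sys_block>/sda/queue/rotational
        device = path.split(os.sep)[-3]
        with open(path) as handle:
            flag = handle.read().strip()
        disks[device] = 'HDD' if flag == '1' else 'SSD'
    return disks
```

Calling classify_disks() with no arguments reads the live system; the sys_block parameter is only there so the function can be pointed at a test directory.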


Now that I know which are SSDs and which are HDDs, here is how I got detailed information on the HDD and the SSD.

kinetic@paco1:~$ sudo hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
    Model Number:       ST1000NM0033-9ZM173                   
    Serial Number:      Z1W10W30
    Firmware Revision:  SN03  
    Transport:          Serial, SATA Rev 3.0
Standards:
    Supported: 9 8 7 6 5
    Likely used: 9
Configuration:
    Logical        max    current
    cylinders    16383    16383
    heads        16    16
    sectors/track    63    63
    --
    CHS current addressable sectors:   16514064
    LBA    user addressable sectors:  268435455
    LBA48  user addressable sectors: 1953525168
    Logical  Sector size:                   512 bytes
    Physical Sector size:                   512 bytes
    Logical Sector-0 offset:                  0 bytes
    device size with M = 1024*1024:      953869 MBytes
    device size with M = 1000*1000:     1000204 MBytes (1000 GB)
    cache/buffer size  = unknown
    Form Factor: 3.5 inch
    Nominal Media Rotation Rate: 7200
Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, no device specific minimum
    R/W multiple sector transfer: Max = 16    Current = ?
    Recommended acoustic management value: 254, current value: 0
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
         Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
    Enabled    Supported:
       *    SMART feature set
            Security Mode feature set
       *    Power Management feature set
       *    Write cache
       *    Look-ahead
       *    Host Protected Area feature set
       *    WRITE_BUFFER command
       *    READ_BUFFER command
       *    DOWNLOAD_MICROCODE
            SET_MAX security extension
       *    48-bit Address feature set
       *    Mandatory FLUSH_CACHE
       *    FLUSH_CACHE_EXT
       *    SMART error logging
       *    SMART self-test
       *    General Purpose Logging feature set
       *    WRITE_{DMA|MULTIPLE}_FUA_EXT
       *    64-bit World wide name
       *    IDLE_IMMEDIATE with UNLOAD
            Write-Read-Verify feature set
       *    WRITE_UNCORRECTABLE_EXT command
       *    {READ,WRITE}_DMA_EXT_GPL commands
       *    Segmented DOWNLOAD_MICROCODE
            unknown 119[6]
       *    unknown 119[7]
       *    Gen1 signaling speed (1.5Gb/s)
       *    Gen2 signaling speed (3.0Gb/s)
       *    Gen3 signaling speed (6.0Gb/s)
       *    Native Command Queueing (NCQ)
       *    Phy event counters
       *    Idle-Unload when NCQ is active
       *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
            DMA Setup Auto-Activate optimization
            Device-initiated interface power management
       *    Software settings preservation
            unknown 78[7]
       *    SMART Command Transport (SCT) feature set
       *    SCT Write Same (AC2)
       *    SCT Error Recovery Control (AC3)
       *    SCT Features Control (AC4)
       *    SCT Data Tables (AC5)
            unknown 206[7]
            unknown 206[12] (vendor specific)
            unknown 206[14] (vendor specific)
Security:
    Master password revision code = 65534
        supported
    not    enabled
    not    locked
    not    frozen
    not    expired: security count
        supported: enhanced erase
    116min for SECURITY ERASE UNIT. 116min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000c500656fbf99
    NAA        : 5
    IEEE OUI    : 000c50
    Unique ID    : 0656fbf99
Checksum: correct

For the SSD

kinetic@paco1:/dev$ sudo hdparm -I /dev/sdb

/dev/sdb:

ATA device, with non-removable media
    Model Number:       ST240HM000-1G5152                     
    Serial Number:      Z4N005W0
    Firmware Revision:  C675  
    Transport:          Serial, SATA Rev 3.0
Standards:
    Used: unknown (minor revision code 0x0029)
    Supported: 8 7 6 5
    Likely used: 8
Configuration:
    Logical        max    current
    cylinders    16383    0
    heads        16    0
    sectors/track    63    0
    --
    LBA    user addressable sectors:  268435455
    LBA48  user addressable sectors:  468862128
    Logical  Sector size:                   512 bytes
    Physical Sector size:                  4096 bytes
    Logical Sector-0 offset:                  0 bytes
    device size with M = 1024*1024:      228936 MBytes
    device size with M = 1000*1000:      240057 MBytes (240 GB)
    cache/buffer size  = unknown
    Form Factor: 2.5 inch
    Nominal Media Rotation Rate: Solid State Device
Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, no device specific minimum
    R/W multiple sector transfer: Max = 16    Current = 16
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
         Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
    Enabled    Supported:
       *    SMART feature set
            Security Mode feature set
       *    Power Management feature set
       *    Write cache
       *    Look-ahead
       *    Host Protected Area feature set
       *    WRITE_BUFFER command
       *    READ_BUFFER command
       *    NOP cmd
       *    DOWNLOAD_MICROCODE
       *    48-bit Address feature set
       *    Mandatory FLUSH_CACHE
       *    FLUSH_CACHE_EXT
       *    SMART error logging
       *    SMART self-test
       *    General Purpose Logging feature set
       *    WRITE_{DMA|MULTIPLE}_FUA_EXT
       *    64-bit World wide name
       *    WRITE_UNCORRECTABLE_EXT command
       *    Segmented DOWNLOAD_MICROCODE
       *    Gen1 signaling speed (1.5Gb/s)
       *    Gen2 signaling speed (3.0Gb/s)
       *    Gen3 signaling speed (6.0Gb/s)
       *    Native Command Queueing (NCQ)
       *    Host-initiated interface power management
       *    Phy event counters
       *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
            Non-Zero buffer offsets in DMA Setup FIS
            DMA Setup Auto-Activate optimization
            In-order data delivery
       *    Software settings preservation
       *    DOWNLOAD MICROCODE DMA command
       *    Data Set Management TRIM supported (limit 1 block)
Security:
    Master password revision code = 65534
        supported
    not    enabled
    not    locked
    not    frozen
    not    expired: security count
    not    supported: enhanced erase
    8min for SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000c5002ff09784
    NAA        : 5
    IEEE OUI    : 000c50
    Unique ID    : 02ff09784
Checksum: correct
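
Most of the hdparm -I output is noise for my purpose. The lines I care about are the model, the form factor and the rotation rate (7200 for the HDD, "Solid State Device" for the SSD). Here is a sketch that extracts those three fields from text in the format above; the function name summarize_hdparm is hypothetical.

```python
import re


def summarize_hdparm(hdparm_output):
    """Pull model, form factor and rotation rate out of `hdparm -I` text."""
    fields = {}
    for key in ('Model Number', 'Form Factor', 'Nominal Media Rotation Rate'):
        match = re.search(r'^\s*%s:\s*(.+?)\s*$' % re.escape(key),
                          hdparm_output, re.MULTILINE)
        fields[key] = match.group(1) if match else None
    return fields


sample = """\
    Model Number:       ST240HM000-1G5152
    Form Factor: 2.5 inch
    Nominal Media Rotation Rate: Solid State Device
"""
info = summarize_hdparm(sample)
print(info['Model Number'], '|', info['Nominal Media Rotation Rate'])
```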



Tuesday, December 23, 2014

Monitoring Network Bandwidth of an OpenStack Swift deployment running on Kinetic drives

I’m planning on running some performance tests on my OpenStack Swift deployment running on Seagate Kinetic drives. The Kinetic drives are Ethernet drives, i.e. the Swift proxy server accesses the drives over the network using REST-like APIs.

While looking for network monitoring tools that run on Linux, I came across a site that covers some 18 or so tools for monitoring network bandwidth on a Linux server.

I evaluated a few of them, and found two that would meet my requirements. The tools that I short-listed are cbm (Color Bandwidth Meter) and tcptrack.

Here are a couple of screenshots from each of the tools.

The tcptrack tool shows me all the connections made from the client (the Swift PACO node) to the Kinetic drives. The first screenshot was taken while running the getput benchmarking tool, while the second was taken using the knobs benchmarker-swift-write.py script.
 


The cbm tool shows me the total data transferred on each of the network interfaces on the PACO server. As above, the first screenshot was taken while running the getput benchmarking tool, while the second was taken using the knobs benchmarker-swift-write.py script.



NOTE: As this was just a dry run to check the network monitoring tools, I used a single Kinetic drive to create the Object ring.

Tuesday, November 11, 2014

Monday, November 10, 2014

Ring Ring Ring!!!

In OpenStack Swift there are 3 Rings, an Account ring, a Container ring, and an Object ring.

What are these Rings and what are they made of?
The rings are modified consistent hash rings that hold partitions that are on the physical disks.

So what are partitions?
Think of partitions as directories that hold the actual data.

And how many partitions are there on a disk drive?
Swift documentation recommends a minimum of 100 partitions per drive. We have seen customers use up to 1000 partitions per drive.
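
A ring is created with a partition power, so the total number of partitions is always a power of two. Here is a rough sizing sketch, assuming the 100-partitions-per-drive figure above and a guess at the largest number of drives the cluster will ever have. The function name partition_power is hypothetical.

```python
def partition_power(max_drives, partitions_per_drive=100):
    """Smallest p such that 2**p >= max_drives * partitions_per_drive."""
    wanted = max_drives * partitions_per_drive
    # (n - 1).bit_length() is ceil(log2(n)) for n >= 1, without float rounding.
    return (wanted - 1).bit_length()


power = partition_power(50)
print(power, 2 ** power)  # prints 13 8192
```

For 50 drives this gives 2**13 = 8192 partitions, which works out to roughly 163 partitions per drive.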

What are these Rings used for?
The ring helps us find the information and data that we are looking for when we do a GET. During a PUT, the ring tells us where (on which physical disk) the data will be stored.
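
The lookup starts by hashing the path of the thing we are looking for and keeping only the top bits of the hash, which gives the partition number. Below is a simplified sketch of that idea. The function name get_partition is my own, and a real Swift cluster also mixes a per-cluster secret prefix/suffix into the hash, which is left out here.

```python
import hashlib
import struct


def get_partition(account, container, obj, part_power):
    """Map an object path to a partition number in [0, 2**part_power)."""
    path = '/%s/%s/%s' % (account, container, obj)
    digest = hashlib.md5(path.encode('utf-8')).digest()
    # Read the first 4 bytes as an unsigned big-endian integer and keep
    # only the top part_power bits.
    top = struct.unpack_from('>I', digest)[0]
    return top >> (32 - part_power)


# Hypothetical names, with a ring of 2**10 = 1024 partitions.
print(get_partition('AUTH_test', 'foobar', 'photo.jpg', 10))
```

The same path always hashes to the same partition, which is why a GET can find what an earlier PUT stored.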

What do the data structures in the Ring tell us?
Actually, there are two internal data structures. The first one tells us where the three replicas of a partition are stored, e.g. partition 0 is on devices 7, 12, and 1, while partition 2 is on devices 1, 8, and 10.


The second data structure tells us where and how to find these devices. For example, device 1 is in region 2, zone 1, and is reachable at a particular IP address and port number.
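
To make the two structures concrete, here is a toy version in Python. The device ids for partitions 0 and 2 and the region/zone of device 1 come from the examples above. Everything else (the ids for partition 1, the other regions and zones, the IP addresses, the port, and the variable and function names) is made up for illustration and only loosely modelled on Swift.

```python
# Structure 1: one row per replica, indexed by partition number.
# Column 1 (partition 1) is invented; columns 0 and 2 match the text.
replica2part2dev_id = [
    [7, 3, 1],    # replica 0
    [12, 4, 8],   # replica 1
    [1, 5, 10],   # replica 2
]

# Structure 2: device id -> where and how to reach the device.
# Addresses use the 192.0.2.0/24 documentation range; all values are
# hypothetical placeholders.
devs = {
    dev_id: {'id': dev_id, 'region': 1, 'zone': 1,
             'ip': '192.0.2.%d' % dev_id, 'port': 6000}
    for dev_id in range(13)
}
devs[1].update(region=2, zone=1)  # device 1 is in region 2, zone 1


def get_nodes(partition):
    """Return the device records holding each replica of a partition."""
    return [devs[row[partition]] for row in replica2part2dev_id]


print([node['id'] for node in get_nodes(0)])  # prints [7, 12, 1]
print([node['id'] for node in get_nodes(2)])  # prints [1, 8, 10]
```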







Monday, October 27, 2014

OpenStack Swift container creation error.

I was following the https://github.com/swiftstack/kinetic-swift/wiki/Deployment page to deploy Swift using Kinetic drives.

After creating the account, container, and object rings I started Swift using “swift-init start main”. The Swift processes came up fine, and I was also able to do “swift stat”.

Next, when I tried to create a container called foobar, I got this error message.

user@vm:~$ sudo swift -U test:tester -K testing -A http://localhost:8080/auth/v1.0 post foobar
Container PUT failed: http://localhost:8080/v1/AUTH_test/foobar 404 Not Found  [first 60 chars of response] <html><h1>Not Found</h1><p>The resource could not be found.<

Looking at /var/log/syslog showed that the account server could not create the accounts directory under /swift/sdv because of a permissions problem.

user@vm:~$ tail -f /var/log/syslog
Oct 21 15:22:34 vm account-server: ERROR __call__ error with PUT /sdv/802/AUTH_test : #012Traceback (most recent call last):#012  File "/home/user/git/kinetic-swift/swift/swift/account/server.py", line 274, in __call__#012    res = method(req)#012  File "/home/user/git/kinetic-swift/swift/swift/common/utils.py", line 2422, in wrapped#012    return func(*a, **kw)#012  File "/home/user/git/kinetic-swift/swift/swift/common/utils.py", line 1023, in _timing_stats#012    resp = func(ctrl, *args, **kwargs)#012  File "/home/user/git/kinetic-swift/swift/swift/account/server.py", line 142, in PUT#012    broker.initialize(timestamp.internal)#012  File "/home/user/git/kinetic-swift/swift/swift/common/db.py", line 232, in initialize#012    mkdirs(self.db_dir)#012  File "/home/user/git/kinetic-swift/swift/swift/common/utils.py", line 740, in mkdirs#012    os.makedirs(path)#012  File "/usr/lib/python2.7/os.py", line 150, in makedirs#012    makedirs(head, mode)#012  File "/usr/lib/python2.7/os.py", line 150, in makedirs#012    makedirs(head, mode)#012  File "/usr/lib/python2.7/os.py", line 150, in makedirs#012    makedirs(head, mode)#012  File "/usr/lib/python2.7/os.py", line 157, in makedirs#012    mkdir(name, mode)#012OSError: [Errno 13] Permission denied: '/swift/sdv/accounts' (txn: txc2152db00f0a4ee5ba0b8-005446dcaa)
Oct 21 15:22:34 vm account-server: 127.0.0.1 - - [21/Oct/2014:22:22:34 +0000] "PUT /sdv/802/AUTH_test" 500 1132 "-" "txc2152db00f0a4ee5ba0b8-005446dcaa" "-" 0.0143 "-"
Oct 21 15:22:34 vm proxy-server: Container GET returning 503 for (503,) (txn: txc2152db00f0a4ee5ba0b8-005446dcaa) (client_ip: 127.0.0.1)
Oct 21 15:22:34 vm proxy-server: Could not autocreate account '/AUTH_test' (txn: txc2152db00f0a4ee5ba0b8-005446dcaa) (client_ip: 127.0.0.1)
Oct 21 15:22:34 vm proxy-server: 127.0.0.1 127.0.0.1 21/Oct/2014/22/22/34 PUT /v1/AUTH_test/foobar HTTP/1.0 404 - python-swiftclient-1.8.0.7.g775a24b AUTH_tkceea92749... - 70 - txc2152db00f0a4ee5ba0b8-005446dcaa - 0.0204 - - 1413930154.009783983 1413930154.030232906  
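
The traceback in the first log line is hard to read because syslog has replaced every newline with #012 (the octal code for a line feed). Here is a small sketch that undoes the escaping and picks out the final line of the traceback; the function names are hypothetical, and the sample is a shortened copy of the log line above.

```python
def decode_syslog_message(line):
    """Turn syslog's #012 escapes back into real newlines."""
    return line.replace('#012', '\n')


def last_traceback_line(line):
    """Return the final line of the decoded message (the actual error)."""
    return decode_syslog_message(line).splitlines()[-1]


sample = ("ERROR __call__ error with PUT /sdv/802/AUTH_test : "
          "#012Traceback (most recent call last):"
          "#012    mkdir(name, mode)"
          "#012OSError: [Errno 13] Permission denied: '/swift/sdv/accounts'")
print(last_traceback_line(sample))
```

On the sample this prints the OSError line, which is the part that points at the directory permissions.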


user@vm:/swift$ ls -l
total 4
d-wxrw-r-t 2 root root 4096 Oct 17 16:04

Fix:
user@vm:/swift$ sudo chmod o+rw /swift/sdv