This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: [cfarm-admins] Extremely Slow Disk Access On GCC119


Thank you for the reply!

On Sun, Sep 10, 2017 at 2:49 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
> On Sun, Sep 10, 2017 at 9:08 PM, R0b0t1 <r030t1@gmail.com> wrote:
>> On Sun, Sep 10, 2017 at 1:06 PM, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>>> Yes, the disks are very slow. You just have to live with it.
>>>
>>
>> The commands I was having problems with are rm and tar (to decompress
>> an xz archive, and to delete a failed compilation environment setup).
>> The software project in question is Gentoo's Portage, which is known
>> for "stressing" filesystems due to the high file/inode count of its
>> resource base (automated build scripts for various software projects).
>> Unxz ran for the better part of a day with no end in sight. It should
>> take 2 minutes or less. The compilation of the necessary core packages
>> (Gentoo's @system) likewise ran for over a day with no end in sight,
>> whereas on the Linux machines it finished in a few minutes.
>>
>> (I am actually very thankful that there seem to be no hard limits on
>> CF user accounts. The space required for portage isn't gigantic but
>> might be larger than what some people are willing to let users drop
>> into their $HOME.)
>>
>> It is this that makes me think the issue is not solely hardware. Poor
>> handling of writes seems like a more likely cause, especially since
>> some I/O-limited operations, while slower than on other disks I have
>> used, are not extremely slow.
>>
>> If the issue has been provably linked to the disk controller, I would
>> appreciate an explanation as I am interested in how that was done. I
>> apologize for bothering anyone but I would stress I am very willing to
>> listen. I am simply not very smart, sirs. If my lack of intelligence
>> is insulting please say so and I will leave.
>
> I and a number of AIX VIOS experts performed an assessment of the
> system with AIX performance tools.
>

Would you mind describing what was done in detail? I understand if you
do not have the time, but I am genuinely interested.

I ask because what I am doing can be pathological even on Linux, and
using IBM-provided tools doesn't guarantee you will discover why
something is happening. Such a tool shows you what it thinks might be
happening and suggests fixes that are (hopefully) within your power to
apply. Most importantly, an IBM-provided tool is unlikely to criticize
anything made by IBM. Consequently, I don't see how performing an
assessment (of what, and how?) disproves the claim that there is
something wrong with the AIX kernel.
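
For what it's worth, a portable microbenchmark would let us compare the
machines directly instead of relying on vendor tooling. A rough sketch of
the kind of measurement I mean (assuming only that Python 3 is available
on the host; the file count and payload size are arbitrary choices of
mine):

#!/usr/bin/env python3
# Rough metadata-heavy I/O microbenchmark: create, stat, and unlink many
# small files -- the pattern that Portage trees and configure scripts
# produce. The absolute numbers mean little; comparing them across the
# farm machines is the point.
import os
import tempfile
import time

N = 20000  # arbitrary; enough files for the metadata cost to dominate

def timed(label, fn):
    start = time.monotonic()
    fn()
    print(f"{label}: {time.monotonic() - start:.2f}s for {N} files")

with tempfile.TemporaryDirectory(dir=os.path.expanduser("~")) as root:
    paths = [os.path.join(root, f"f{i:05d}") for i in range(N)]

    def create():
        for p in paths:
            with open(p, "w") as f:
                f.write("x" * 128)  # tiny payload; the inode work dominates

    timed("create", create)
    timed("stat  ", lambda: [os.stat(p) for p in paths])
    timed("unlink", lambda: [os.unlink(p) for p in paths])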

I can find no better way to express the sentiment, so please understand
I mean no disrespect: you may as well have told me you were an expert in
propeller beanies. No explanation is owed to me, but I can't in good
conscience take the explanation given as comprehensive.

> The system was configured to maximize diskspace and flexibility.  It
> now is supporting six, separate VMs.  The disk array was configured as
> a single physical volume, mapped to a single logical volume, that then
> is partitioned into virtual I/O devices mapped to the VMs, which then
> are formatted for AIX filesystems.  It's a lot of virtualization
> layers. I already have increased the disk queues in the AIX VMs, which
> increased performance relative to the initial installation.  Also, the
> VIOS was slightly under-sized for the current amount of usage, but I
> have avoided rebooting the entire system to adjust that.
>

If I understand the setup properly, this should produce little
noticeable slowdown. The layers hand off data with very little
processing.

Linux host systems tend to have similar setups with LVM2. It might even
be a good idea to keep the PV/LV setup despite the overhead, because
it's so much more flexible.
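
If the question is whether the virtualization layers add meaningful
per-write latency, timing small synchronous writes (write plus fsync)
would show it long before sequential throughput does. A rough sketch of
what I mean, again assuming nothing more than Python 3 on the host and
arbitrary iteration counts:

#!/usr/bin/env python3
# Time small synchronous writes (write + fsync). A block path with many
# virtualization layers, or no write cache, shows up here long before it
# shows up in sequential throughput.
import os
import statistics
import tempfile
import time

ITERATIONS = 200       # arbitrary sample size
PAYLOAD = b"x" * 4096  # one small block per write

latencies = []
with tempfile.NamedTemporaryFile(dir=os.path.expanduser("~")) as f:
    for _ in range(ITERATIONS):
        start = time.monotonic()
        f.write(PAYLOAD)
        f.flush()
        os.fsync(f.fileno())  # push the write down through every layer
        latencies.append((time.monotonic() - start) * 1000.0)

print(f"fsync'd 4 KiB writes, n={ITERATIONS}")
print(f"  median: {statistics.median(latencies):.2f} ms")
print(f"  p95:    {sorted(latencies)[int(0.95 * ITERATIONS)]:.2f} ms")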

> Ideally it would be better to directly partition the disk array and
> map the partitions to the AIX VMs, but it is difficult and disruptive
> to implement now.  There is a proposal to replace the disk array
> device adapter with a write-caching adapter, which may or may not
> happen.
>

I'm not entirely sure. In an absolute sense there would be less
overhead but relative to other slowdowns on the system I am not sure
the inefficiency of the abstraction layers matters.

It might just be the case that the other guests are nearly saturating
the disk I/O. It's not my place to ask what they're doing, but if that
is what's happening, then I suppose there's nothing to be done about it.

In my personal experience, however, Linux would fare better in this
situation. Pointing at the AIX kernel might still be useful if it
exposes tuning parameters that could be changed; I/O queues are very
hard to get right.
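
To make the saturation point concrete: if the shared array behaves at
all like a single queue, time spent waiting grows non-linearly as
utilization approaches 100%, so whoever adds the last bit of load sees
dramatically worse latency than a guest running alone. A toy
single-server simulation (purely illustrative; Poisson arrivals and
exponential service times are my assumptions, not a model of the real
VIOS path):

#!/usr/bin/env python3
# Toy single-server FIFO queue: shows how time in the system grows
# non-linearly as utilization approaches 100%. Purely illustrative; the
# real VIOS/disk path is far more complex, but the shape of the curve is
# the point.
import random

def mean_time_in_system(utilization, service_time=1.0, n_jobs=200_000, seed=1):
    rng = random.Random(seed)
    arrival_rate = utilization / service_time
    clock = 0.0    # arrival time of the current job
    free_at = 0.0  # time at which the server becomes free
    total = 0.0
    for _ in range(n_jobs):
        clock += rng.expovariate(arrival_rate)   # next arrival
        start = max(clock, free_at)              # wait if the server is busy
        free_at = start + rng.expovariate(1.0 / service_time)
        total += free_at - clock                 # waiting plus service
    return total / n_jobs

for u in (0.50, 0.80, 0.90, 0.95, 0.99):
    print(f"utilization {u:.0%}: ~{mean_time_in_system(u):.1f}x the bare service time")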

>>
>>> Obviously migrating gcc119 to Linux would be silly when we have other
>>> Linux machines. The whole point of this one is to have an AIX host for
>>> testing in AIX.
>>>
>>
>> True, which is why I pointed it out. However the system is very hard
>> to use. I suppose this is useful information about AIX.
>>
>>
>> I also have an outstanding request for PowerKVM support (on GCC112)
>> and access to the hypervisor, but in that case I do not expect a
>> prompt response at all. However, I am slightly worried that it may
>> never be addressed (even if the answer is "no," which would be very
>> sad).
>
> GCC112 will not provide user access to the hypervisor.  You can ask
> the OSUOSL Powerdev cloud if they will provide such access.
>

Thank you for the referral; I will see whether that would help. I do
appreciate the access to a POWER8 system that is already available, and
the time you have put into your responses.

Respectfully,
     R0b0t1


> Thanks, David
>
>>
>> As for the value of what I am doing: most of the CF systems are so
>> outdated as to be useless for modern development work. I can use a
>> Gentoo Prefix/libc (https://wiki.gentoo.org/wiki/Prefix/libc)
>> installation to run modern software on an outdated system. I'm
>> currently fixing a lot of ppc64(le) issues, but I did get an
>> environment working on GCC10.
>>
>> Respectfully,
>>      R0b0t1
>>
>>>
>>>
>>> On Sunday, 10 September 2017, R0b0t1 <r030t1@gmail.com> wrote:
>>>>
>>>> Hello list, has anyone experienced problems on the AIX POWER8 system?
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: R0b0t1 <r030t1@gmail.com>
>>>> Date: Sun, Sep 10, 2017 at 9:58 AM
>>>> Subject: Re: [cfarm-admins] Extremely Slow Disk Access On GCC119
>>>> To: David Edelsohn <dje.gcc@gmail.com>
>>>>
>>>>
>>>> Do you care to explain? I'm not trying to tell you how to do your job,
>>>> I'm trying to make you aware of a problem.
>>>>
>>>> I do not understand how this could only be a problem with the disk
>>>> controller, because I have never encountered such poor performance. If
>>>> it is only a problem with the disk, has the disk been failing this
>>>> whole time?
>>>>
>>>> I hope you do not mind, but I will be forwarding this to the GCC
>>>> mailing list. I am concerned by your replies.
>>>>
>>>>
>>>> On Sun, Sep 10, 2017 at 3:47 AM, David Edelsohn <dje.gcc@gmail.com> wrote:
>>>> > Your analysis is completely wrong.
>>>> >
>>>> > David
>>>> >
>>>> > On Sep 10, 2017 10:08 AM, "R0b0t1" <r030t1@gmail.com> wrote:
>>>> >>
>>>> >> Thank you for the response!
>>>> >>
>>>> >> I don't necessarily mean to request funding for the AIX system in this
>>>> >> ticket, but I feel I should point out that the system is more or
>>>> >> less unusable if anything requires much I/O. Assuming the HD is
>>>> >> anything modern, it seems to me this is a scheduling or caching issue
>>>> >> in the AIX kernel.
>>>> >>
>>>> >> To reiterate, something that should take a few minutes took a day due
>>>> >> to slow IO. Something that should have taken a fraction of a second
>>>> >> was taking minutes.
>>>> >>
>>>> >> That it could possibly be the AIX kernel makes me want to ask whether
>>>> >> it would be possible to migrate the system to Linux. However, I would
>>>> >> assume someone needs it for testing on AIX. I can just as well use
>>>> >> GCC112.
>>>> >>
>>>> >> I am doing my best not to monopolize GCC119, but it is very hard to
>>>> >> design around certain IO operations being very slow. I am not sure if
>>>> >> this is impacting any users; it doesn't seem like the AIX system
>>>> >> receives regular use.
>>>> >>
>>>> >> Cheers,
>>>> >>      R0b0t1
>>>> >>
>>>> >> On Sun, Sep 10, 2017 at 1:45 AM, David Edelsohn <dje.gcc@gmail.com> wrote:
>>>> >> > The backing disk array for the virtual I/O disks was under-designed
>>>> >> > for the configuration of the system.  There is a request to upgrade
>>>> >> > the disk controller, but it is unclear if that will be funded.  The I/O
>>>> >> > system already has been tuned with bigger buffers so the performance
>>>> >> > is much better than it was when originally installed.
>>>> >> >
>>>> >> > Please remember that all of the systems are shared systems and if
>>>> >> > there are limitations on a system, they affect all users, so
>>>> >> > please try not to monopolize or overload the systems.
>>>> >> >
>>>> >> > Thanks, David
>>>> >> >
>>>> >> > On Sun, Sep 10, 2017 at 8:18 AM, R0b0t1 via cfarm-admins
>>>> >> > <cfarm-admins@lists.tetaneutral.net> wrote:
>>>> >> >> Hello,
>>>> >> >>
>>>> >> >> I apologize for creating another ticket. I am in the process of
>>>> >> >> running rm on a directory structure that is removed very quickly on
>>>> >> >> the Linux CF machines where I have used the same command.
>>>> >> >>
>>>> >> >> In addition the setup of a few software packages took about a day in
>>>> >> >> total due to disk access when running configure scripts. On Linux,
>>>> >> >> this completes in a matter of minutes.
>>>> >> >>
>>>> >> >> Is this a problem with AIX/jfs?
>>>> >> >>
>>>> >> >> Respectfully,
>>>> >> >>      R0b0t1
>>>> >> >> _______________________________________________
>>>> >> >> cfarm-admins mailing list
>>>> >> >> cfarm-admins@lists.tetaneutral.net
>>>> >> >> https://lists.tetaneutral.net/listinfo/cfarm-admins

