2005-10-24 14:32

If our site has fewer than 20 Prognosis nodes, using Auto-Detection is reasonable. With more than 20 nodes, switch to Managed Nodes because Auto-Detection becomes too cumbersome.

Prognosis overhead can be quite heavy if the collection interval is too short, so be careful.

Once again stuck with crappy Windows desktop for class. Installed gvim, gaim, firefox, and putty was already installed.

Is the GUI still Windows only? Yes – very, very sadly.

The web server is the piece that talks to either the managing node, or any node on an auto-detection enabled Prognosis zone. So does the web server need to run on the same host as the managing node? The web server can run on any Apache & Tomcat server (and some other servers) and can run on Windows, Linux, or NonStop just fine.

Can alarms be sent to Netcool Omnibus? If not, what about logged in plain text to an OSS file? If not, what about logged to EMS which can trigger a Netcool Omnibus message?

Yes: Netcool is reached through the “Enterprise Manager” product, which we have licensed.

Integrated Research makes Prognosis.

There are a few Prognosis customers who are not running any NSK at all! 😉

Sometimes the network manager gets very busy, and we need to learn how to determine who is causing the extra workload (usually a GUI).

AOL has licensed the following Prognosis products:
ADI (app dev interface), AO (auto operations), BM (batch), DM (disk), EM (enterprise/Netcool), GUI, PM (performance), WVO (Web).

A large chunk of this “training” class is a sales briefing on the different Prognosis products… sigh…

Prognosis server software probably opens a lot of security holes. Integrated Research does not focus on Security except as a reaction to customer demands, so there is no security by default, and what can be turned on by the installer can probably be turned off easily enough. Have OPSSEC/SYSSEC do an audit.

Documentation is all in either PDF or Winblows Online Help (sigh). At least the PDFs are self-indexed… so we can copy the PDFs to a real server.

Day 2:

It is important to create your own “key performance indicators” displays, that are built by copying and pasting specific windows from existing displays in the Knowledge tree in the Documents Navigator.

Once we have a display we like, export it to the web server, and that will be available for all the web users to view.

Page 57 deals with how to create a custom display.

Page 73 has an undocumented feature, use of Node value wrapped in greater than and less than signs.

Prognosis can measure, graph, and threshold/alarm/action WebLogic and WebSphere, Apache, J2EE Servlets (Tomcat) and even DB2 applications out of the box!

Links and Drilldowns are key to making your main KPI displays really useful.

Auto Documentation tool is AWESOME!!!

Day 3:

Thresholds are awesome (like the PMIE of CoPilot). Timing Parameters can be tricky, but very useful (page 150 and 151).

Alarm WHERE BUSY > 95 AND QLEN > 3
Timing Check Interval of 1 minute, log at most every 10 minutes, log after 4 consecutive intervals to EMS
Timing Check Interval of 1 minute, log after 4 consecutive intervals and log off when event over to NetCool
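
The timing rules above are easier to reason about as code. This is only a sketch of how I read them, written in Python rather than Prognosis threshold syntax. The names Sample and alarm_minutes are made up for illustration, and the real product may count intervals differently.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    minute: int   # when the check ran, in minutes (1-minute check interval)
    busy: float   # CPU busy percent
    qlen: int     # process queue length


def alarm_minutes(samples, needed=4, min_gap=10):
    """Return the minutes at which an alarm would be logged.

    Assumed rule: the WHERE clause must hold for `needed` consecutive
    checks, and alarms are logged at most once every `min_gap` minutes.
    """
    logged = []
    streak = 0
    last_logged = None
    for s in samples:
        # WHERE BUSY > 95 AND QLEN > 3
        if s.busy > 95 and s.qlen > 3:
            streak += 1
        else:
            streak = 0  # condition cleared, start counting again
        due = last_logged is None or s.minute - last_logged >= min_gap
        if streak >= needed and due:
            logged.append(s.minute)
            last_logged = s.minute
    return logged
```

With 20 straight busy minutes (minutes 0 to 19) this logs at minutes 3 and 13. A three-minute spike logs nothing, which is the false-alarm behavior we want for the NOC.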

Reasons to Prognosis:
Fix the damned false alarms problem (CPU Busy to NOC)
start actually having SYSQ system capacity slides
replace logcheck alarms with the dispatcher
catch MFSERV alarms and netcool them
start catching RDF delays
share EMS logs via web to NOC

^srcnode@ will embed the nodename in the message

Ask Tony Guliani what “destination” they used for their Netcool alarms from Prognosis…

Day 4:

Thresholds are very useful for finding conditions within a single record and triggering reactions (page/netcool/email/command).

Analysts are good for connecting conditions from multiple records – but it requires the use of a scripting language. Language is TACL like. Analysts also cost more (separate product to license).

Both thresholds and analysts have password issues with running commands (triggered events from Prognosis that involve remote commands). Is that true of IRCMD based instances on the server?

Create Thresholds to alarm when cron or nos have died and remain dead.

Copy the example NSKAlerts Threshold from Knowledge to My Displays – then modify with AOL site specific changes, and again for DLS05 (batch machine) specific changes.

Web Interface Web Server Requirements: A web server (Apache/AOLServer/IIS), a Java Servlet Container (Tomcat 4.1/WebSphere 5.0/WebLogic 7.0), and JDK 2.0 SDK.

Web browser requirements: HTML4, Java Applets (API 1.1 or newer), JavaScript (1.2 or newer), Session Cookies

The webserver host will need a free license we need to request from IR so that IRWASP will run (interfaces from host to Prognosis running servers).

The “Single Node Key Indicators” web display is AWESOME.

Day 5:

Quick view of the Alerts “Atlas” – which only becomes useful if the NSK Alerts threshold has been customized for our site by copying the existing one, then modifying our copy.

Clearly TCPIP Error Rate (NSK TCP Errors) threshold needs to be increased to 1% or perhaps even 2%, or we’ll get alarms endlessly. Spooler files should be excluded from the File Full threshold. TMF Very Long Transactions needs a filter to ignore planned batch jobs (how?). XYGATESR can be excluded from the NSK Long Process Queueing threshold.

NSKCommunicationLink Record as seen in Perf TCPIP All Sockets window/display is effectively a netstat -an for QA and Dev to use. Tune the display for the DEV/QA/OPS interfaces on MFDEV. VERY COOL.

Be sure to create a Network Share Drive for all the Windows Prognosis GUI users to share their Prognosis Documents!!! This should be backed up regularly.

Each Prognosis “Product” has its own configuration.

Key Configurations: Prognosis, Security, Disk Manager, UpDown/Availability, NodeGroup, and Dispatch Manager.

AOL Useful Configurations: Comms, Availability, Disk, Display, Extractor, Network, Node group, passwords, pathway, prognosis, security, up/down.

Bottom Line, these are things we should be doing here at AOL that Prognosis can do for us. Is it the only way to do these things?

1) Fix false alarms to NOC, specifically the CPU busy alarms. Right now our netcool alarms come from the EMS scraper that only Melvin and Tony know how to maintain. We need more people to know that process. We need to be able to delay alarm until a CPU has stayed busy for more than a minute and also has a queue depth greater than 1 or 2. Currently not possible – possible with Prognosis, with anything else?

2) start automatically reporting weekly system performance slides at SYSQ.

3) start catching RDF slowness automatically

4) replace Logcheck alarms since logcheck is not working now because of the sockets pm problem in perl.

5) catch MFSERV process status alarms

6) share EMS logs and viewpt/tracker/sysbusy data with NOC and DEV and QA over the web so they no longer need to know Tandem to get that data.

7) give Kavita and Amy their netstat -an output through a webform instead of SCF commands.

8) have a record of what, specifically, were the busiest processes at any given time for the last recorded periods (days). That way a specific CPU busy can be traced back to a root cause. No more mysteries!

The Prognosis Admin for any site should know how many resources Prognosis is using at any time. See the Prognosis Status Display.

Ask Integrated Research what is new, do this regularly. They (IR) are not good about pro-active notifications.


2005-10-17 13:19

Ryan Upton rupton@bea.com is instructor

Course venue sucks. Crappy Winblows computers with guest accounts and only 256MB of RAM, which Ryan says is not enough to do WebLogic server stuff. Why are we doing this on Windows at all? We all use only HP-UX/Linux/Solaris servers… sigh.

Installed GAIM and used the existing Firefox on the Windows PC, used telnet windows to get on Abelard. Using ELM to read mail.

Documentation at:

Java 2 Platform, Enterprise Edition (J2EE) defines the standard for developing component-based multitier enterprise applications.
Enterprise Information Systems (EIS) – slide on page 14 is useful.

Middleware makes it appear to the back end servers, that there are only a few, very busy users connected (saves on resources).

Standards are agreed upon ahead of time.

Java makes me want to hurl. Too many clueless, self-proclaimed coders can write bad code with no compile time checks to validate their crap.

Slide on page 18 is really useful, blow one of these up for the office wall. J2EE Overview Slide.

ODBC/JDBC – database connectivity.
JNDI – Java Naming and Directory Interface: the API for DNS/LDAP style lookups.
EJB (Enterprise Java Beans)

a server is the weblogic.server itself, one instance
a machine is the host that runs the server(s)
a cluster is a logical group of servers
a domain is a related group of servers and cluster(s)
an administration server is the central control for a domain, “there can only be one”
a managed server is any server in a domain that is not the admin server (controlled remotely)
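
To keep the vocabulary straight, here is a small Python model of those terms. It is my own illustration, not anything from the WebLogic API: the class names, field names, and the add_server helper are all invented for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    machine: str            # the host that runs this server instance
    port: int = 7001        # WebLogic's default listen port
    is_admin: bool = False


@dataclass
class Domain:
    name: str
    servers: list = field(default_factory=list)
    clusters: dict = field(default_factory=dict)  # cluster name -> server names

    def admin_server(self):
        for s in self.servers:
            if s.is_admin:
                return s
        return None

    def managed_servers(self):
        # Every server in the domain that is not the admin server.
        return [s for s in self.servers if not s.is_admin]

    def add_server(self, server, cluster=None):
        # "There can only be one" administration server per domain.
        if server.is_admin and self.admin_server() is not None:
            raise ValueError("a domain can have only one administration server")
        self.servers.append(server)
        if cluster is not None:
            self.clusters.setdefault(cluster, []).append(server.name)
```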

Silent mode is the way to install – via script – so we can repeat the install identically on hundreds of boxes via scripts.

Labs 1 and 2 are redundant, do lab 2 because it is most like what we’ll use at work. Do labs 2 and 3.

Go to http://e-docs.bea.com/platform/docs81/install/silent.html#1044118
Get and save as silent.xml the example file. Edit it until you are happy.

Insert the PWIN CD in the drive.
cd installation_platform814_win
cd "Documents and Settings"\guestuser\Desktop
d:\installation_platform814_win\platform814_win32.exe -mode=silent -silent_xml=c:\"Documents and Settings"\guestuser\Desktop\silent.xml -log=bea_install.log

It takes a while… and hogs all the memory on the little windows boxes in class.

Grabbed PuTTY so my elm window would support cut & paste (Windows Telnet sucketh greatly):

Domain creation: in general, use “Express” setup, then scripting tools to modify.

The admin console is a web interface to the weblogic server (security? instructor says it is encrypted).

7001 is default port for weblogic servers.

weblogic.Admin’s THREAD_DUMP command is very handy!
java weblogic.Admin -username system -password weblogic -url THREAD_DUMP

The logging section is too slow… post-mortem equine floggery going on here… sigh.

JDBC logging has huge overhead, recommended only for debugging.

Extending with the Domain Config Wizard may be more difficult than simply using the admin console to do the same thing.

If using a boot.properties file to store username/password, restart your server so it gets encrypted with triple DES.

Day 2: new laptops IBM ThinkPad R40s with more memory and Windows 2000.
Re-install Firefox, PuTTY, and try again to install BEA crapware with silent mode. Once more a failure, using GUI/Console mode just to get through this crap.
Installed vim (vim.org) because Windows editors suck.

Lab instructions are whacked. stopweblogic script is broken.

getting a lot of interruptions from work (name_swap and member_validate problems). I’ve missed most of the JDBC stuff.

“JMS Server” is a bad term, it is really just a service configured into an existing managed WebLogic server. Same process, additional service done inline with the existing server process.

In JMS – both the consumer and producer are clients. They are peers – one just happens to provide data and the other requests it.
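
A rough way to picture that peer relationship, using a plain Python queue as a stand-in for a JMS destination. This is not JMS code. The function name run_exchange and the DONE marker are made up for the sketch.

```python
import queue
import threading


def run_exchange(messages):
    """Run one producer and one consumer against a shared destination."""
    destination = queue.Queue()  # stand-in for a queue hosted by the server
    received = []
    DONE = object()              # private end-of-stream marker

    def producer():
        # A client that happens to provide data.
        for m in messages:
            destination.put(m)
        destination.put(DONE)

    def consumer():
        # A client that happens to request data.
        while True:
            m = destination.get()
            if m is DONE:
                break
            received.append(m)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Neither side talks to the other directly. Both only know the destination, which is why both count as clients.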

Restudy JDBC areas, specifically JDBC Driver types (1-4) on pages 195 – 199 (paying attention to the details NOT on the slides). Also re-cover JDBC connection pools info on pp 206-213.

Basically, JMS services and ConnectionFactory are to Web connectivity middleware as inetd.conf, /etc/services, and RPC/YP/NIS+ calls were to UNIX network services.

There is nothing new here except the Java language instead of C with RPC libraries. What a horrible shame the world can’t invent anything new.

This is actually more unmitigated BullShit than Bucky is.

Tomcat is the open source servlet container from the Apache project that implements Java Servlets and JavaServer Pages. It can run standalone or behind Apache httpd.

.war files are archived Web Applications, archived with jar (jar is like tar but for Java junkies, sigh; the file format is actually ZIP)
.jar files are any jar (ZIP format) archive of Java stuff
expanded directory structure can be a web application too
web.xml has the deployment descriptors (pg: 338)
a weblogic.xml is a web.xml with BEA enhancements and proprietary crap (security stuff and/or Virtual Directory foo typically)

Java is always case-sensitive, so META-INF must always be upper case only. The files in the META-INF dir are all private, and used by the webserver’s servlets – not visible to real web clients.
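
Since jar and war archives use the ZIP file format, any ZIP tool can build or inspect one. A minimal Python sketch, assuming a made-up application with one page. The file contents and the helper names build_war and list_entries are mine. In a real web application the web.xml descriptor lives under WEB-INF.

```python
import io
import zipfile

# Placeholder descriptor, just enough to show where the file goes.
WEB_XML = """<?xml version="1.0" encoding="UTF-8"?>
<web-app>
  <display-name>hello</display-name>
</web-app>
"""


def build_war():
    """Build a tiny WAR in memory and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as war:
        war.writestr("index.html", "<html><body>hello</body></html>")
        war.writestr("WEB-INF/web.xml", WEB_XML)          # deployment descriptor
        war.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\r\n")
    return buf.getvalue()


def list_entries(war_bytes):
    """List the archive entries, the way `jar tf` would."""
    with zipfile.ZipFile(io.BytesIO(war_bytes)) as war:
        return sorted(war.namelist())
```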

Day 3: Brad shows up!

Java Beans

atomic, consistent, isolation from other transactions, durable

Wow – excellent description of a transaction in generic terms on pages 418 – 428. Great illustrations and examples.
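
For my own reference, the "atomic" part in runnable form. This uses Python's built-in sqlite3 instead of JTA, so it only shows the general idea. The table, account names, and helper functions are invented for the example.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


def transfer(conn, src, dst, amount):
    """Move money between accounts: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (left,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if left < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False
```

A failed transfer leaves both balances exactly as they were, even though the two UPDATE statements had already run inside the transaction.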

Transactions are configured in the JTA section of the web console, and configured at the WebLogic Domain level.

Do your user and group additions, bulk or individual, through the scripting tool or a JMX based script. This is one place that the web admin console is far too limiting.

SSL Cert Generation info, again.
Node Manager Scripts
Security Foo

Day 4:

startup and shutdown classes
another broken lab, joy
lab17 – java ridemonitor never able to connect to port
rebooting Windows does not help

Neat Windows Command Shell Trick – double click some text, hit the Function key to copy, then paste with the left mouse button.

The Grinder is a cool load testing tool.

Day 5:

Wasting time defending Java and JVM speed/performance. BEA misses the point: that Java programmers are encouraged by the Java/JVM to never learn the underlying details of their systems. The problem is that no one is turning out fast and efficient Java code. C, assembly, Forth, and Ada FORCE the developer to learn the system. They end up turning out more efficient code.

Page 747’s details about the Prepared Statement Cache cover the two most impactful tunables for DB connection performance: LRU and statement cache size.
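
A toy version of an LRU statement cache, to remind myself why the size matters. This is a generic sketch in Python, not WebLogic's implementation. The class name StatementCache and the fake "prepared" tuple are placeholders for a real driver's prepare call.

```python
from collections import OrderedDict


class StatementCache:
    """Keep the most recently used prepared statements, up to `size`."""

    def __init__(self, size):
        self.size = size
        self._cache = OrderedDict()  # oldest entry first
        self.hits = 0
        self.misses = 0

    def prepare(self, sql):
        if sql in self._cache:
            self._cache.move_to_end(sql)  # mark as most recently used
            self.hits += 1
            return self._cache[sql]
        self.misses += 1
        stmt = ("prepared", sql)  # stand-in for an expensive round trip to the DB
        self._cache[sql] = stmt
        if len(self._cache) > self.size:
            self._cache.popitem(last=False)  # evict the least recently used
        return stmt

    def cached(self):
        return list(self._cache)
```

If the cache is smaller than the set of statements the application cycles through, every call becomes a miss, so the size should cover the hot statements.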

Network Tuning details on Page 751…

Lots of good data on stuck JVM threads on pages 768-769.
Detecting Stuck Threads and then “hardening” the server are tuning jobs we’ll need to do often.

“Stuck” is a WLS euphemism for working/busy with no breathers… Jobs that basically get very busy and stay that way.

Make your Java programmers learn about multiple threads and blocking states, and deadlocks… it will pay off big time. In particular, getting people to realize deadlock backoff and negotiation techniques will help a lot.
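
One backoff technique in miniature: try for the second lock with a timeout, and if that fails, release everything, wait a random moment, and try again. Sketched in Python rather than Java. The helper name with_both and the timing values are arbitrary choices for the example.

```python
import random
import threading
import time


def with_both(first, second, work, attempts=1000):
    """Run work() while holding both locks, backing off instead of deadlocking."""
    for _ in range(attempts):
        with first:
            # Do not wait forever for the second lock while holding the first.
            if second.acquire(timeout=0.01):
                try:
                    return work()
                finally:
                    second.release()
        # Both locks are released here. Sleep a random time so two threads
        # that collided do not retry in lockstep.
        time.sleep(random.uniform(0, 0.01))
    raise RuntimeError("could not get both locks")
```

Two threads that take the same pair of locks in opposite order would normally be a textbook deadlock. With the timeout and backoff, both finish. Agreeing on a single lock order is simpler when it is possible.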

Thread monitoring and management are important to find and reduce bottlenecks early (long before production).

Second silver lining is the possibility that more and more Java/WebLogic developers will learn all about transaction programming and keeping transactions short. It could pay off big for DB backend performance in general.

MBeans == managed beans

WLShell is awesome – bash/korn shell Java scripting interface to WebLogic: http://www.wlshell.net/ – it’s like SQSH for WebLogic and Java! Cool

2005-06-21 13:38.

Notes from the Tandem iTP WebServer class.

Intro is talking about the history of the internet and the world wide web. Arpanet, Tim Berners-Lee of CERN (1989). HTTP, URL, HTML all created. First specifications released to public in 1991. A bit of rudimentary review of TCP/IP and internet IP addressing.

Three big applications for years: File Transfer (UUCP/FTP), Mail, Telnet. Web was the fourth big one.

NCSA Mosaic (first widely used graphical web browser) in 1993 at the University of Illinois at Urbana-Champaign.


How is iTP any different than httpd from Apache?

iTP WebServer
TS/MP (Pathway)
OSS and Guardian Environments
NonStop Kernel

Each iTP server has a Distributor (listener) and httpd process, and optionally multiple cgi connections, which can be generic or pathway based. Parallel library TCP/IP implementations have no distributor/listener, but will spread the httpd listeners over multiple interfaces. Typically pathway CGI programs will translate HTTP requests to Pathsends or SQL queries.

Alternatively, the httpd/iTP server can call Servlet Server Classes (SSC) in Java (groan), which run in multiple JVMs (pcode interpreters) on the NonStop machine (expensive).

NonStop SQL/MP, NonStop TS/MP, NonStop Tuxedo all can be called from an atp file (myfile.atp) which is HTML with embedded JavaScript. ATP = Active Transaction Pages.

Still wondering what the differences between Apache’s httpd and iTP WebServer are…

Symmetric versus asymmetric encryption in general terms. Certificate Authority services and distributed public certs and/or keys. SSL, PCT, SET, S-HTTP, and others. SSL in detail, slide 4-16 is a wonderful diagram of how SSL works (allegedly a version is available from Netscape as well). Google “How SSL works” to find out more.

For some reason HP recommends that a non super.super user in group super runs/manages the iTP WebServer software… why? Isn’t this a security risk? Why not run it as a non-privileged user? Say web.mgr and web.user?

Java crap gets installed in /usr/tandem/java and /usr/tandem/nssjava.

Why would you waste expensive Tandem resources running Java servlets when you can run them on Linux boxes? Save the Tandem cycles for SQL processing.

Scary – they have a default config that creates a web based administrative server with very little security… yikes.

Ouch! iTP WebServer cannot be used for any webservice requiring secure logins to the web server. While iTP server provides a rudimentary user admin and password facility, it is a bit of a joke; using no encryption for the transfer of the passwords and no real .htaccess functionality at all.

Apache WebServer 101 Class

2005-05-31 13:45

Class is off to a slow start, but we are getting O’Reilly Apache Cookbooks (the moose) – which is a good sign. Rich from Global Knowledge is teaching.

Apache and Tomcat on Redhat.

1987 telnet
1990 habanero
1993 mosaic
1995 apache

Apache Software Foundation (ASF)
httpd is the name of the webserver

Apache’s httpd is for static content
Dynamic content comes from plugins (Java Servlets, ActiveX, PHP, etc)

Modules can be built into the httpd binary, or they can be loaded dynamically.

Directives tell the server when to use the module.
Each directive should be in a module that is already loaded.

MaxKeepAliveRequests 100, typically this must be much higher.

Config has:
1) comments
2) directives
3) sections

DocumentRoot is for the default webserver only, virtual hosts/domains have their own DocRoot each…

httpd -t
(parses config file to test httpd.conf syntax)
(apache -t) on Windows

vl_module has the AOL internal SNS security stuff

Wednesday: Apache bench (used for testing your server)

ab -n 100 http://heloise.office.aol.com/
(analyzes performance of your server)

ab -c 10 http://www.apple.com/
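
What -n and -c mean, as a stripped-down Python imitation of ab. This is nowhere near a replacement for the real tool. The function names http_get and run_load and the shape of the result dictionary are my own inventions.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_get(url):
    """Fetch a URL and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()
        return resp.status


def run_load(fetch, total, concurrency=1):
    """Call fetch() `total` times (like -n) using `concurrency` workers (like -c)."""

    def timed(_):
        start = time.perf_counter()
        try:
            ok = fetch() == 200
        except Exception:
            ok = False  # count connection errors as failed requests
        return ok, time.perf_counter() - start

    began = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed, range(total)))
    elapsed = time.perf_counter() - began

    times = [t for _, t in results]
    return {
        "requests": total,
        "failed": sum(1 for ok, _ in results if not ok),
        "mean_s": sum(times) / len(times) if times else 0.0,
        "requests_per_s": total / elapsed if elapsed > 0 else float("inf"),
    }
```

Something like run_load(lambda: http_get("http://localhost/"), total=100, concurrency=10) would be the rough equivalent of ab -n 100 -c 10 against a local server.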

JMeter is a more featureful test bed, but Java based – available on the Jakarta project on the Apache website.

Internet Services has a custom binary that analyzes the logs

Proxy vs. Redirect (one hides a domain as another, the other shares the info to the user)

SSI (Server Side Includes) aka: Filters with the INCLUDES builtin
AddOutputFilter INCLUDES .shtml
Options IncludesNOEXEC
AddType text/html .shtml
create an shtml file in your htdocs

Valid XML: (tag with no content)

HTACCESS passwords

htdigest -c mydigestpasswords myrealm snolan

SetHandler server-status
AuthType Digest
AuthName myrealm
AuthDigestFile "c:\Program Files\Apache Group\Apache2\bin\mydigestpwds.dat"
AuthDigestGroupFile "c:\Program Files\Apache Group\Apache2\bin\mygroups.txt"
Require group managers

managers: snolan frank
customers: joe sally

(this does digest password auth: the password crosses the wire as an MD5 digest instead of clear text, but nothing else between browser and server is encrypted)
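
Each line that htdigest writes has the form user:realm:hash, where the hash is the MD5 of "user:realm:password". A short Python sketch of that one line. The function name htdigest_line is mine and the password shown in the usage note is obviously a placeholder.

```python
import hashlib


def htdigest_line(user, realm, password):
    """Build one password-file line in the format htdigest produces."""
    # The stored hash covers user, realm, and password joined by colons.
    ha1 = hashlib.md5(f"{user}:{realm}:{password}".encode("utf-8")).hexdigest()
    return f"{user}:{realm}:{ha1}"
```

For example, htdigest_line("snolan", "myrealm", "not-my-real-password") gives a line starting with snolan:myrealm: followed by 32 hex digits. The realm is part of the hash, so changing AuthName invalidates every stored password.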

Tuesday Night

Tuesday night Erci and I celebrated the 14th anniversary of our marriage. It just does not seem like that long. We get to celebrate twice a year because we did the legal transaction about 8 months before we pulled together the big scary public ceremony. We’ve had a fabulous adventure together so far.

Fourteen years ago she was a database administrator at the Defense Intelligence Agency, actively fighting with rattan in the Society for Creative Anachronism, cooking medieval foods for her friends and household, bicycling, already involved in La Belle Compagnie, traveling to NATO (Brussels) and London quite often for work, and she had ferrets (I still miss Mudge and Newt).
Back then I was still in the United States Air Force on active duty, stationed at the Pentagon, flirting with everyone in the SCA, tinkering with my Amiga computer, running, playing occasional games of pickup football (the kind with goalies) with locals, and bicycling all over the Washington Area. She was unstoppable. She amazed me. She still does.

We’ve lived apart (the USAF sent me to freakin’ Omaha for about 7 months of enforced separation). We’ve been in tiny apartments, a small townhouse, and two very large homes. When I got out of the USAF my income jumped so much that I literally paid more in taxes for 1995 than I grossed for all of 1994! We gradually got less involved in the Society for Creative Anachronism as we got more involved with La Belle Compagnie and as we got very involved with ballroom dancing. She pulled me into La Belle, and I pulled her into Scuba diving, and Marine Aquariums. She helped me re-discover Buddhism. We’ve had several more ferrets, and now a cat and several aquariums. We’ve traveled to England, Jamaica, Mallorca, Greece, Turkey, Curacao, Cozumel, Belize, Bonaire, Japan, the Virgin Islands, Bahamas, Michigan, California, Arizona, Nevada, and New York together. She has become a chef, part-time, while continuing to be an amazing IT professional at work. We have had the opportunity to build our dream home. She is still unstoppable, and I am still amazed.

Tuesday night we went to Le Tire Bouchon, a cozy and quiet little traditional French restaurant in old Fairfax. There were only two other couples there at that time, and one couple finished and left shortly after we got there. We had excellent food, good wine, and got to talk a lot (mostly about work, but talking about anything is fun with her).

She remains my partner, sharing the awesome adventure of life with me as an equal. Sometimes she leads, sometimes I do. We are incredibly blessed and fortunate to have many, many close friends. She’s cooking up another Windjammer cruise to someplace warm and exciting.

Roemmelt for Delegate!!!

Yippee! Bruce Roemmelt (Firefighter, educator, veteran, 2005 candidate) is running for Virginia House of Delegates for the 13th district again. As many of you know, I am very proud to have worked in Bruce’s 2005 campaign to unseat Delegate Robert Marshall (who is way too concerned with what goes on in my private life for me). Please consider helping Bruce win election this November. Visit his website, contribute, volunteer, and help get him elected.

If you live anywhere in Virginia the outcome of this election matters to you, as Delegate Marshall is the author of the hateful marriage amendment that passed last year and Delegate Marshall just tried to sneak a law onto the books that would make contraception illegal in Virginia (HB2797, thanks Kenton).

First Day Back at Work

Today was my first full day back at work, and it was grueling. My heroes are co-workers Peter and Uwe who brilliantly covered the most critical projects while I was out and even managed to make progress on them. I was really tired after a day of work and another follow-up with the doctor. Turns out I already have little post-surgical polyps growing back in (surprised the doctor a little). She pulled a few out, OUCH! I see her again next week where she’ll pull them out for real. I hope this is not the beginning of a new trend.

Ready for sleep now.