Yann Neuhaus

dbi services technical blog

Real time replication from Oracle to PostgreSQL using Data Replicator from DBPLUS

Mon, 2019-12-02 07:15

I’ve done quite a few real-time logical replication projects in the past, either using Oracle GoldenGate or EDB Replication Server. Built-in logical replication in PostgreSQL (available since PostgreSQL 10) can be used as well when both the source and the target are PostgreSQL instances. While attending the DOAG conference and exhibition 2019 I got in contact with people from DBPLUS, who provide a product called “Data Replicator”. The interesting use case for me is real-time replication from Oracle to PostgreSQL, as the next project for such a setup is already in the pipeline, so I thought I’d give it a try.
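
As a side note, the built-in PostgreSQL-to-PostgreSQL case boils down to a publication on the source and a subscription on the target. The following is only a minimal sketch with made-up names (my_pub, my_sub, the connection string) and it assumes wal_level is set to logical on the source:

-- on the source instance (publisher)
CREATE PUBLICATION my_pub FOR TABLE hr.employees;

-- on the target instance (subscriber)
CREATE SUBSCRIPTION my_sub
  CONNECTION 'host=source-host dbname=sourcedb user=repuser password=xxxxxxx'
  PUBLICATION my_pub;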

The “Data Replicator” software needs to be installed on a Windows machine and all traffic will go through that machine. The following picture is stolen from the official “Data Replicator” documentation and it pretty well describes the architecture when the source system is Oracle:

As “Data Replicator” uses Oracle LogMiner, no triggers need to be installed on the source system. Installing something on a validated system might become tricky, so this already is a huge benefit compared to some other solutions, e.g. SymmetricDS. If you know GoldenGate, the overall architecture is not so different: what GoldenGate calls the extract is the “Reader” in Data Replicator, and the replicat becomes the “Applier”.

The installation on the Windows machine is so simple that I’ll just provide the screenshots without any further comments:





In the background three new services have been created and started by the installation program:

There is the replication manager, which is responsible for creating replication processes, and two more services for reading from the source and writing data to the target. In addition, the graphical user interface was installed (it could also run on another Windows machine); it looks like this once you start it up:

Before connecting with the GUI you should do the basic configuration by using the “DBPLUS Replication Manager Configuration” utility:

Once that is done you can go back to the client and connect:

The initial screen does not have much content, except for the possibility to create a new replication, and I really like that: no overloaded, hard-to-understand interface, but an easy and tidy one. With only one choice it is easy to go forward, so let’s create a new replication:

Same concept here: a very clean interface, only 5 steps to follow. My source system is Oracle 19.3 EE and all I have to do is provide the connection parameters, the admin user and a new user/password combination I want to use for the logical replication:

I asked “Data Replicator” to create the replication user, and all is fine:

SQL> r
  1* select username,profile from dba_users where username = 'REPLUSR'

USERNAME                       PROFILE
------------------------------ ------------------------------
REPLUSR                        DEFAULT

Of course some system privileges have been granted to the user that got created:

SQL> select privilege from dba_sys_privs where grantee = 'REPLUSR';

PRIVILEGE
----------------------------------------
SELECT ANY TRANSACTION
LOGMINING
SELECT ANY DICTIONARY
SELECT ANY TABLE
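
If you prefer to create such a user manually instead of letting the installer do it, the equivalent would look roughly like this (the password is a placeholder, and the installer may grant additional roles or object privileges that do not show up in DBA_SYS_PRIVS):

SQL> create user replusr identified by "xxxxxxx";
SQL> grant select any transaction to replusr;
SQL> grant logmining to replusr;
SQL> grant select any dictionary to replusr;
SQL> grant select any table to replusr;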

Proceeding with the target database, which is PostgreSQL 12.1 in my case:

As you can see there is no option to create a user on the target. What I did is this:

postgres=# create user replusr with login password 'xxxxxxx';
CREATE ROLE
postgres=# create database offloadoracle with owner = 'replusr';
CREATE DATABASE
postgres=# 

Once done, the connection succeeds and can be saved:

That’s all for the first step and we can proceed to step two:

I have installed the Oracle sample schemas for this little demo and as I only want to replicate these I’ve changed the selection to “REPLICATE ONLY SELECTED SCHEMAS AND TABLES”.

Once more this is all that needs to be done and the next step would be to generate the report for getting an idea of possible issues:

The reported issues totally make sense and you even get the commands to fix them, except for the complaints about the missing unique keys, of course (if you go for logical replication you should make sure anyway that each table contains either a primary key or at least a unique key). Once the Oracle database is in archivelog mode and supplemental log data has been added, the screen looks fine (I will ignore the two warnings as they are not important for this demo):
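
For completeness, this is roughly what these prerequisites look like when done manually (a sketch only; the report lists the exact commands it expects):

SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database archivelog;
SQL> alter database open;
SQL> alter database add supplemental log data;
SQL> select log_mode, supplemental_log_data_min from v$database;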

The next step is to define the “Start Options” and when you select “automatic” you’ll have to specify the options for the transfer server:

There is a small configuration utility for that as well:

When you are happy with it, provide the details in the previous screen and complete the replication setup by providing a name in the last step:

That’s all you need to do and the replication is ready to be started:

… and then it immediately fails because we do not have a valid license. For getting a trial license you need to provide the computer ID which can be found in the information section:

Provide that to DBPLUS and request a trial license; usually they respond very fast:

Starting the replication once more:

You’ll see new processes on the PostgreSQL side:

postgres@centos8pg:/home/postgres/ [121] ps -ef | grep postgres
root      1248   769  0 12:58 ?        00:00:00 sshd: postgres [priv]
postgres  1252     1  0 12:58 ?        00:00:00 /usr/lib/systemd/systemd --user
postgres  1256  1252  0 12:58 ?        00:00:00 (sd-pam)
postgres  1262  1248  0 12:58 ?        00:00:00 sshd: postgres@pts/0
postgres  1263  1262  0 12:58 pts/0    00:00:00 -bash
postgres  1667     1  0 12:58 ?        00:00:00 /u01/app/postgres/product/12/db_0/bin/postgres -D /u02/pgdata/12
postgres  1669  1667  0 12:58 ?        00:00:00 postgres: checkpointer   
postgres  1670  1667  0 12:58 ?        00:00:00 postgres: background writer   
postgres  1671  1667  0 12:58 ?        00:00:00 postgres: walwriter   
postgres  1672  1667  0 12:58 ?        00:00:00 postgres: autovacuum launcher   
postgres  1673  1667  0 12:58 ?        00:00:00 postgres: stats collector   
postgres  1674  1667  0 12:58 ?        00:00:00 postgres: logical replication launcher   
postgres  2560  1667  0 14:40 ?        00:00:00 postgres: replusr offloadoracle 192.168.22.1(40790) idle
postgres  2562  1667  0 14:40 ?        00:00:00 postgres: replusr offloadoracle 192.168.22.1(40800) idle
postgres  2588  1263  0 14:40 pts/0    00:00:00 ps -ef
postgres  2589  1263  0 14:40 pts/0    00:00:00 grep --color=auto postgres
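
The same sessions can also be checked from within PostgreSQL, for example with a quick look at pg_stat_activity (assuming the replication user is called replusr as above):

postgres=# select pid, usename, datname, client_addr, state from pg_stat_activity where usename = 'replusr';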

… and you’ll see LogMiner processes on the Oracle side:

LOGMINER: summary for session# = 2147710977
LOGMINER: StartScn: 2261972 (0x00000000002283d4)
LOGMINER: EndScn: 18446744073709551615 (0xffffffffffffffff)
LOGMINER: HighConsumedScn: 0
LOGMINER: PSR flags: 0x0
LOGMINER: Session Flags: 0x4000441
LOGMINER: Session Flags2: 0x0
LOGMINER: Read buffers: 4
LOGMINER: Region Queue size: 256
LOGMINER: Redo Queue size: 4096
LOGMINER: Memory LWM: limit 10M, LWM 12M, 80%
LOGMINER: Memory Release Limit: 0M
LOGMINER: Max Decomp Region Memory: 1M
LOGMINER: Transaction Queue Size: 1024
2019-11-22T14:05:54.735533+01:00
LOGMINER: Begin mining logfile for session -2147256319 thread 1 sequence 8, /u01/app/oracle/oradata/DB1/onlinelog/o1_mf_2_gxh8fbhr_.log
2019-11-22T14:05:54.759197+01:00
LOGMINER: End   mining logfile for session -2147256319 thread 1 sequence 8, /u01/app/oracle/oradata/DB1/onlinelog/o1_mf_2_gxh8fbhr_.log

In the details tab there is more information about what is currently going on:

Although it looked quite good at the beginning, the first issue showed up:

Oracle data type is unknown: OE.CUST_ADDRESS_TYP
Stack trace:
System.ArgumentException: Oracle data type is unknown: OE.CUST_ADDRESS_TYP
   at DbPlus.DataTypes.Oracle.OracleDataTypes.Get(String name)
   at DbPlus.Replicator.Tracking.TableCopyParameterCreator.c__DisplayClass22_0.g__MapSourceColumnType|1(TableColumn sourceColumn, String targetColumnName)
   at DbPlus.Replicator.Tracking.TableCopyParameterCreator.c__DisplayClass25_0.g__GetColumnMapping|4(TableColumn sourceColumn)
   at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
   at System.Linq.Enumerable.WhereEnumerableIterator`1.MoveNext()
   at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
   at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
   at DbPlus.Replicator.Tracking.TableCopyParameterCreator.c__DisplayClass25_0.b__5()
   at DbPlus.Replicator.Alerts.AsyncTransientErrorHandler.Execute[T](Func`1 operation)
   at DbPlus.Replicator.Tracking.TableCopyParameterCreator.GetTableCopyParameters(ReplicatedTable sourceTable)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.b__61_2(ReplicatedTable table)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.ExecuteOneWithTableLock(Func`1 source, Action`1 operation, Nullable`1 timeout)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.ExecuteAllWithTableLock(Func`1 source, Action`1 operation, Nullable`1 timeout)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.b__61_0()
   at DbPlus.Replicator.Alerts.AsyncTransientErrorHandler.Block(Action action)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.StartDataTransfer()
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.ProcessRemoteOperations()
   at DbPlus.Tasks.Patterns.TaskTemplates.c__DisplayClass0_0.<g__Run|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at DbPlus.Tasks.Patterns.TaskGroup.Run(CancellationToken cancellationToken)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.Run()
   at DbPlus.Replicator.ComponentModel.Component.RunInternal()

As with all logical replication solutions, custom types are tricky and usually not supported. What I will do now is replicate the “HR” and “SH” schemas only, which do not contain any custom types:
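
To check upfront which columns are based on a user-defined type (and would therefore trigger this kind of error), a query like the following does the job on the source; note that DATA_TYPE_OWNER is also populated for built-in object types such as XMLTYPE, so the result may need a closer look:

SQL> select owner, table_name, column_name, data_type
     from dba_tab_columns
     where owner in ('OE','HR','SH')
     and data_type_owner is not null;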

Once again, starting the replication, next issue:

Oracle data type is unknown: ROWID
Stack trace:
System.ArgumentException: Oracle data type is unknown: ROWID
   at DbPlus.DataTypes.Oracle.OracleDataTypes.Get(String name)
   at DbPlus.Replicator.Tracking.TableCopyParameterCreator.c__DisplayClass22_0.g__MapSourceColumnType|1(TableColumn sourceColumn, String targetColumnName)
   at DbPlus.Replicator.Tracking.TableCopyParameterCreator.c__DisplayClass25_0.g__GetColumnMapping|4(TableColumn sourceColumn)
   at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
   at System.Linq.Enumerable.WhereEnumerableIterator`1.MoveNext()
   at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
   at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
   at DbPlus.Replicator.Tracking.TableCopyParameterCreator.c__DisplayClass25_0.b__5()
   at DbPlus.Replicator.Alerts.AsyncTransientErrorHandler.Execute[T](Func`1 operation)
   at DbPlus.Replicator.Tracking.TableCopyParameterCreator.GetTableCopyParameters(ReplicatedTable sourceTable)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.b__61_2(ReplicatedTable table)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.ExecuteOneWithTableLock(Func`1 source, Action`1 operation, Nullable`1 timeout)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.ExecuteAllWithTableLock(Func`1 source, Action`1 operation, Nullable`1 timeout)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.b__61_0()
   at DbPlus.Replicator.Alerts.AsyncTransientErrorHandler.Block(Action action)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.StartDataTransfer()
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.ProcessRemoteOperations()
   at DbPlus.Tasks.Patterns.TaskTemplates.c__DisplayClass0_0.<g__Run|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at DbPlus.Tasks.Patterns.TaskGroup.Run(CancellationToken cancellationToken)
   at DbPlus.Replicator.Tracking.ReplicatedTablesTracker.Run()
   at DbPlus.Replicator.ComponentModel.Component.RunInternal()

Let’s check which columns and tables those are:

SQL> SELECT owner, table_name, column_name from dba_tab_columns where data_type = 'ROWID' and owner in ('HR','SH');

OWNER                TABLE_NAME                     COLUMN_NAME
-------------------- ------------------------------ ------------------------------
SH                   DR$SUP_TEXT_IDX$U              RID
SH                   DR$SUP_TEXT_IDX$K              TEXTKEY

Such columns can easily be excluded (these DR$ tables belong to an Oracle Text index anyway, so there is no need to replicate them):



Starting over again, next issue:

At least the schemas need to exist on the target, so:

postgres=# \c offloadoracle postgres
You are now connected to database "offloadoracle" as user "postgres".
offloadoracle=# create schema sh;
CREATE SCHEMA
offloadoracle=# create schema hr;
CREATE SCHEMA
offloadoracle=# 

Next try:

On the source side:

SQL> grant flashback any table to REPLUSR;

Grant succeeded.

SQL> 

On the target side:

offloadoracle=# grant all on schema hr to replusr;
GRANT
offloadoracle=# grant all on schema sh to replusr;
GRANT

Finally most of the tables are replicating fine now:

There are a few warnings about missing unique keys and some tables cannot be replicated at all:
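
A quick way to spot tables without a primary or unique key upfront is a query like this on the source (it only looks at declared constraints, not at unique indexes without a constraint):

SQL> select t.owner, t.table_name
     from dba_tables t
     where t.owner in ('HR','SH')
     and not exists ( select 1
                      from dba_constraints c
                      where c.owner = t.owner
                      and c.table_name = t.table_name
                      and c.constraint_type in ('P','U') );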

For now I am just going to exclude the failed tables as this is fine for the scope of this post:

… and my replication is fine. A quick check on the target:

offloadoracle=# select * from sh.products limit 3;
 prod_id |           prod_name           |           prod_desc           | prod_subcategory | prod_subcategory_id | prod_subcategory_desc |        prod_category        | prod_category_id |     prod_category_desc      | prod_weight_class | prod_unit_of_measure | prod_pac>
---------+-------------------------------+-------------------------------+------------------+---------------------+-----------------------+-----------------------------+------------------+-----------------------------+-------------------+----------------------+--------->
      13 | 5MP Telephoto Digital Camera  | 5MP Telephoto Digital Camera  | Cameras          |     2044.0000000000 | Cameras               | Photo                       |   204.0000000000 | Photo                       |                 1 | U                    | P       >
      14 | 17" LCD w/built-in HDTV Tuner | 17" LCD w/built-in HDTV Tuner | Monitors         |     2035.0000000000 | Monitors              | Peripherals and Accessories |   203.0000000000 | Peripherals and Accessories |                 1 | U                    | P       >
      15 | Envoy 256MB - 40GB            | Envoy 256MB - 40Gb            | Desktop PCs      |     2021.0000000000 | Desktop PCs           | Hardware                    |   202.0000000000 | Hardware                    |                 1 | U                    | P       >
(3 rows)


… confirms the data is there. As this post is already long enough, here are some final thoughts: the installation of “Data Replicator” is a no-brainer. I really like the simple interface, and setting up a replication between Oracle and PostgreSQL is quite easy. Of course you need to know the issues you can run into with logical replication (missing unique or primary keys, unsupported data types, …) but that topic is the same for all solutions. What I can say for sure is that I have never set up a demo replication as fast as with “Data Replicator”. More testing to come …


AEM Forms – FIPS 140-2 Support

Mon, 2019-12-02 00:00

Around summer last year, one of the projects I was working on started a new integration with AEM Forms for the Digital Signatures and Reader Extensions components. It was already using AEM Forms before, but for other purposes. With this new requirement came new problems (obviously). This project was still using AEM Forms 6.4 JEE on WebLogic Server 12.2.1.3.

As mentioned a few times already, our policy is security by default, unless the customer has specific requirements that prevent us from doing that. Since we usually work for critical businesses, that’s normally not a problem at all (quite the opposite). Therefore, when we install a WebLogic Server, we always apply our best practices on top of it. One of these best practices is to enable FIPS (Federal Information Processing Standards) 140-2 support and, as much as possible, compliance. A software stack is FIPS 140-2 compliant if all its components support FIPS 140-2 and they can all restrict their operations to FIPS 140-2 methods/transactions only. If a single piece of the stack in use isn’t FIPS 140-2 compliant, then the whole software stack isn’t.

Alright, so why am I mentioning all that? Well, we try as much as possible to have fully FIPS 140-2 compliant installations and therefore we always use restricted ciphers, encryption algorithms, protocols, and so on. In this FIPS 140-2 compliant AEM Forms installation, we tried to add Digital Signatures & Reader Extensions on PDFs, but while doing some testing in the AEM Workbench, we encountered the following error pop-up:

FIPS 140-2 Reader Extension error

The complete error stack can be seen in the AEM Workbench logs:

ALC-DSC-003-000: com.adobe.idp.dsc.DSCInvocationException: Invocation error.
	at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:152)
	at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
	at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
	at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doSupports(EjbTransactionCMTAdapterBean.java:227)
	at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
	at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
	at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doSupports(Unknown Source)
	at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:104)
	at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
	at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
	at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.routeMessage(AbstractMessageReceiver.java:93)
	at com.adobe.idp.dsc.provider.impl.vm.VMMessageDispatcher.doSend(VMMessageDispatcher.java:225)
	at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:69)
	at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:215)
	at com.adobe.workflow.engine.PEUtil.invokeAction(PEUtil.java:893)
	at com.adobe.idp.workflow.dsc.invoker.WorkflowDSCInvoker.transientInvoke(WorkflowDSCInvoker.java:356)
	at com.adobe.idp.workflow.dsc.invoker.WorkflowDSCInvoker.invoke(WorkflowDSCInvoker.java:159)
	at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
	at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
	at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doRequiresNew(EjbTransactionCMTAdapterBean.java:299)
	at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
	at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
	at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doRequiresNew(Unknown Source)
	at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:143)
	at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
	at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
	at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
	at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
	at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.invoke(AbstractMessageReceiver.java:329)
	at com.adobe.idp.dsc.provider.impl.soap.axis.sdk.SoapSdkEndpoint.invokeCall(SoapSdkEndpoint.java:153)
	at com.adobe.idp.dsc.provider.impl.soap.axis.sdk.SoapSdkEndpoint.invoke(SoapSdkEndpoint.java:91)
	at sun.reflect.GeneratedMethodAccessor753.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:397)
	at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:186)
	at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:323)
	at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
	at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
	at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
	at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:454)
	at org.apache.axis.server.AxisServer.invoke(AxisServer.java:281)
	at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:699)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
	at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:286)
	at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:260)
	at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:137)
	at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:350)
	at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:25)
	at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
	at com.adobe.idp.dsc.provider.impl.soap.axis.InvocationFilter.doFilter(InvocationFilter.java:43)
	at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
	at com.adobe.idp.um.auth.filter.ParameterFilter.doFilter(ParameterFilter.java:105)
	at com.adobe.idp.um.auth.filter.CSRFFilter.invokeNextFilter(CSRFFilter.java:141)
	at com.adobe.idp.um.auth.filter.CSRFFilter.doFilter(CSRFFilter.java:132)
	at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
	at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3701)
	at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3667)
	at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:326)
	at weblogic.security.service.SecurityManager.runAsForUserCode(SecurityManager.java:197)
	at weblogic.servlet.provider.WlsSecurityProvider.runAsForUserCode(WlsSecurityProvider.java:203)
	at weblogic.servlet.provider.WlsSubjectHandle.run(WlsSubjectHandle.java:71)
	at weblogic.servlet.internal.WebAppServletContext.doSecuredExecute(WebAppServletContext.java:2443)
	at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2291)
	at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2269)
	at weblogic.servlet.internal.ServletRequestImpl.runInternal(ServletRequestImpl.java:1705)
	at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1665)
	at weblogic.servlet.provider.ContainerSupportProviderImpl$WlsRequestExecutor.run(ContainerSupportProviderImpl.java:272)
	at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:352)
	at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:337)
	at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:57)
	at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
	at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:644)
	at weblogic.work.ExecuteThread.execute(ExecuteThread.java:415)
	at weblogic.work.ExecuteThread.run(ExecuteThread.java:355)
Caused by: com.adobe.livecycle.readerextensions.client.exceptions.ReaderExtensionsException: ALC-RES-001-008: Unable to apply the requested usage rights to the given document.
	at com.adobe.livecycle.readerextensions.ReaderExtensionsImplementation.applyUsageRights(ReaderExtensionsImplementation.java:125)
	at com.adobe.livecycle.readerextensions.ReaderExtensionsService.applyUsageRights(ReaderExtensionsService.java:166)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:118)
	... 102 more
Caused by: com.adobe.livecycle.readerextensions.client.ProcessingException: ALC-RES-001-008: Unable to apply the requested usage rights to the given document.
	... 109 more
Caused by: com.adobe.internal.pdftoolkit.core.exceptions.PDFInvalidParameterException: Exception encountered when applying the signature
	at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.applyUsageRights(SignatureManager.java:1803)
	at com.adobe.livecycle.readerextensions.ReaderExtensionsImplementation.applyUsageRights(ReaderExtensionsImplementation.java:110)
	... 108 more
Caused by: com.adobe.internal.pdftoolkit.core.exceptions.PDFSignatureException: com.adobe.idp.cryptoprovider.CryptoProviderException: Unknown Error in CryptoProvider ALC-CRP-302-002 (in the operation : sign)
 Caused By: ALC-DSS-310-048 Could not sign PKCS7 data (in the operation : sign)
  Caused By: Algorithm not allowable in FIPS140 mode: SHA1/RSA(null-1)
	at com.adobe.idp.cryptoprovider.LCPKCS7Signer.sign(LCPKCS7Signer.java:128)
	at com.adobe.internal.pdftoolkit.services.digsig.digsigframework.impl.SignatureHandlerPPKLite.writeSignatureAfterSave(SignatureHandlerPPKLite.java:816)
	at com.adobe.internal.pdftoolkit.services.digsig.impl.SigningUtils.doSigning(SigningUtils.java:801)
	at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.applyUsageRights(SignatureManager.java:1797)
	... 109 more
Caused by: com.adobe.idp.cryptoprovider.CryptoProviderException: Unknown Error in CryptoProvider ALC-CRP-302-002 (in the operation : sign)
 Caused By: ALC-DSS-310-048 Could not sign PKCS7 data (in the operation : sign)
  Caused By: Algorithm not allowable in FIPS140 mode: SHA1/RSA(null-1)
	... 113 more

 

From the AEM Forms side, the error that can be seen at the same time was:

####<Jun 18, 2018 10:14:54,073 AM UTC> <Warning> <com.adobe.idp.cryptoprovider.CryptoProviderException> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '56' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <7dc476bf-0258-4e62-96d4-e9bcf5274954-000001bd> <1529316894073> <[severity-value: 16] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <ALC-DSS-310-048 Could not sign PKCS7 data (in the operation : sign)
Caused By: Algorithm not allowable in FIPS140 mode: SHA1/RSA(null-1)>
####<Jun 18, 2018 10:14:54,074 AM UTC> <Error> <com.adobe.idp.cryptoprovider.CryptoProviderException> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '56' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <7dc476bf-0258-4e62-96d4-e9bcf5274954-000001bd> <1529316894074> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Unknown Error in CryptoProvider ALC-CRP-302-002 (in the operation : sign)
Caused By: ALC-DSS-310-048 Could not sign PKCS7 data (in the operation : sign)
  Caused By: Algorithm not allowable in FIPS140 mode: SHA1/RSA(null-1)>
####<Jun 18, 2018 10:14:54,079 AM UTC> <Error> <com.adobe.livecycle.readerextensions.ReaderExtensionsImplementation> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '56' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <7dc476bf-0258-4e62-96d4-e9bcf5274954-000001bd> <1529316894079> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <ALC-RES-001-008: Unable to apply the requested usage rights to the given document.
com.adobe.internal.pdftoolkit.core.exceptions.PDFInvalidParameterException: Exception encountered when applying the signature
        at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.applyUsageRights(SignatureManager.java:1803)
        at com.adobe.livecycle.readerextensions.ReaderExtensionsImplementation.applyUsageRights(ReaderExtensionsImplementation.java:110)
        at com.adobe.livecycle.readerextensions.ReaderExtensionsService.applyUsageRights(ReaderExtensionsService.java:166)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:118)
        at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doSupports(EjbTransactionCMTAdapterBean.java:227)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
        at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doSupports(Unknown Source)
        at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:104)
        at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
        at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
        at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.routeMessage(AbstractMessageReceiver.java:93)
        at com.adobe.idp.dsc.provider.impl.vm.VMMessageDispatcher.doSend(VMMessageDispatcher.java:198)
        at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:69)
        at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:215)
        at com.adobe.livecycle.readerextensions.client.ReaderExtensionsServiceClient.invoke(ReaderExtensionsServiceClient.java:58)
        at com.adobe.livecycle.readerextensions.client.ReaderExtensionsServiceClient.applyUsageRights(ReaderExtensionsServiceClient.java:105)
        at com.adobe.livecycle.readerextensions.servlet.ApplyRightsServlet.applyRights(ApplyRightsServlet.java:241)
        at com.adobe.livecycle.readerextensions.servlet.ApplyRightsServlet.doOperation(ApplyRightsServlet.java:189)
        at com.adobe.livecycle.readerextensions.servlet.ApplyRightsServlet.doPost(ApplyRightsServlet.java:80)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
        at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:286)
        at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:260)
        at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:137)
        at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:350)
        at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:25)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
        at com.adobe.idp.um.auth.filter.ParameterFilter.doFilter(ParameterFilter.java:105)
        at com.adobe.idp.um.auth.filter.CSRFFilter.invokeNextFilter(CSRFFilter.java:141)
        at com.adobe.idp.um.auth.filter.CSRFFilter.doFilter(CSRFFilter.java:132)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
        at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3701)
        at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3667)
        at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:326)
        at weblogic.security.service.SecurityManager.runAsForUserCode(SecurityManager.java:197)
        at weblogic.servlet.provider.WlsSecurityProvider.runAsForUserCode(WlsSecurityProvider.java:203)
        at weblogic.servlet.provider.WlsSubjectHandle.run(WlsSubjectHandle.java:71)
        at weblogic.servlet.internal.WebAppServletContext.doSecuredExecute(WebAppServletContext.java:2443)
        at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2291)
        at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2269)
        at weblogic.servlet.internal.ServletRequestImpl.runInternal(ServletRequestImpl.java:1705)
        at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1665)
        at weblogic.servlet.provider.ContainerSupportProviderImpl$WlsRequestExecutor.run(ContainerSupportProviderImpl.java:272)
        at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:352)
        at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:337)
        at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:57)
        at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
        at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:644)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:415)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:355)
Caused By: com.adobe.internal.pdftoolkit.core.exceptions.PDFSignatureException: com.adobe.idp.cryptoprovider.CryptoProviderException: Unknown Error in CryptoProvider ALC-CRP-302-002 (in the operation : sign)
Caused By: ALC-DSS-310-048 Could not sign PKCS7 data (in the operation : sign)
  Caused By: Algorithm not allowable in FIPS140 mode: SHA1/RSA(null-1)
        at com.adobe.idp.cryptoprovider.LCPKCS7Signer.sign(LCPKCS7Signer.java:128)
        at com.adobe.internal.pdftoolkit.services.digsig.digsigframework.impl.SignatureHandlerPPKLite.writeSignatureAfterSave(SignatureHandlerPPKLite.java:816)
        at com.adobe.internal.pdftoolkit.services.digsig.impl.SigningUtils.doSigning(SigningUtils.java:801)
        at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.applyUsageRights(SignatureManager.java:1797)
        at com.adobe.livecycle.readerextensions.ReaderExtensionsImplementation.applyUsageRights(ReaderExtensionsImplementation.java:110)
        at com.adobe.livecycle.readerextensions.ReaderExtensionsService.applyUsageRights(ReaderExtensionsService.java:166)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:118)
        at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doSupports(EjbTransactionCMTAdapterBean.java:227)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
        at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doSupports(Unknown Source)
        at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:104)
        at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
        at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
        at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.routeMessage(AbstractMessageReceiver.java:93)
        at com.adobe.idp.dsc.provider.impl.vm.VMMessageDispatcher.doSend(VMMessageDispatcher.java:198)
        at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:69)
        at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:215)
        at com.adobe.livecycle.readerextensions.client.ReaderExtensionsServiceClient.invoke(ReaderExtensionsServiceClient.java:58)
        at com.adobe.livecycle.readerextensions.client.ReaderExtensionsServiceClient.applyUsageRights(ReaderExtensionsServiceClient.java:105)
        at com.adobe.livecycle.readerextensions.servlet.ApplyRightsServlet.applyRights(ApplyRightsServlet.java:241)
        at com.adobe.livecycle.readerextensions.servlet.ApplyRightsServlet.doOperation(ApplyRightsServlet.java:189)
        at com.adobe.livecycle.readerextensions.servlet.ApplyRightsServlet.doPost(ApplyRightsServlet.java:80)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
        at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:286)
        at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:260)
        at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:137)
        at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:350)
        at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:25)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
        at com.adobe.idp.um.auth.filter.ParameterFilter.doFilter(ParameterFilter.java:105)
        at com.adobe.idp.um.auth.filter.CSRFFilter.invokeNextFilter(CSRFFilter.java:141)
        at com.adobe.idp.um.auth.filter.CSRFFilter.doFilter(CSRFFilter.java:132)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
        at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3701)
        at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3667)
        at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:326)
        at weblogic.security.service.SecurityManager.runAsForUserCode(SecurityManager.java:197)
        at weblogic.servlet.provider.WlsSecurityProvider.runAsForUserCode(WlsSecurityProvider.java:203)
        at weblogic.servlet.provider.WlsSubjectHandle.run(WlsSubjectHandle.java:71)
        at weblogic.servlet.internal.WebAppServletContext.doSecuredExecute(WebAppServletContext.java:2443)
        at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2291)
        at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2269)
        at weblogic.servlet.internal.ServletRequestImpl.runInternal(ServletRequestImpl.java:1705)
        at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1665)
        at weblogic.servlet.provider.ContainerSupportProviderImpl$WlsRequestExecutor.run(ContainerSupportProviderImpl.java:272)
        at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:352)
        at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:337)
        at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:57)
        at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
        at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:644)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:415)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:355)
Caused By: com.adobe.idp.cryptoprovider.CryptoProviderException: Unknown Error in CryptoProvider ALC-CRP-302-002 (in the operation : sign)
Caused By: ALC-DSS-310-048 Could not sign PKCS7 data (in the operation : sign)
  Caused By: Algorithm not allowable in FIPS140 mode: SHA1/RSA(null-1)
        at com.adobe.idp.cryptoprovider.LCPKCS7Signer.sign(LCPKCS7Signer.java:128)
        at com.adobe.internal.pdftoolkit.services.digsig.digsigframework.impl.SignatureHandlerPPKLite.writeSignatureAfterSave(SignatureHandlerPPKLite.java:816)
        at com.adobe.internal.pdftoolkit.services.digsig.impl.SigningUtils.doSigning(SigningUtils.java:801)
        at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.applyUsageRights(SignatureManager.java:1797)
        at com.adobe.livecycle.readerextensions.ReaderExtensionsImplementation.applyUsageRights(ReaderExtensionsImplementation.java:110)
        at com.adobe.livecycle.readerextensions.ReaderExtensionsService.applyUsageRights(ReaderExtensionsService.java:166)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:118)
        at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doSupports(EjbTransactionCMTAdapterBean.java:227)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
        at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
        at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doSupports(Unknown Source)
        at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:104)
        at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
        at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
        at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
        at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
        at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.routeMessage(AbstractMessageReceiver.java:93)
        at com.adobe.idp.dsc.provider.impl.vm.VMMessageDispatcher.doSend(VMMessageDispatcher.java:198)
        at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:69)
        at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:215)
        at com.adobe.livecycle.readerextensions.client.ReaderExtensionsServiceClient.invoke(ReaderExtensionsServiceClient.java:58)
        at com.adobe.livecycle.readerextensions.client.ReaderExtensionsServiceClient.applyUsageRights(ReaderExtensionsServiceClient.java:105)
        at com.adobe.livecycle.readerextensions.servlet.ApplyRightsServlet.applyRights(ApplyRightsServlet.java:241)
        at com.adobe.livecycle.readerextensions.servlet.ApplyRightsServlet.doOperation(ApplyRightsServlet.java:189)
        at com.adobe.livecycle.readerextensions.servlet.ApplyRightsServlet.doPost(ApplyRightsServlet.java:80)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
        at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:286)
        at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:260)
        at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:137)
        at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:350)
        at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:25)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
        at com.adobe.idp.um.auth.filter.ParameterFilter.doFilter(ParameterFilter.java:105)
        at com.adobe.idp.um.auth.filter.CSRFFilter.invokeNextFilter(CSRFFilter.java:141)
        at com.adobe.idp.um.auth.filter.CSRFFilter.doFilter(CSRFFilter.java:132)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
        at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3701)
        at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3667)
        at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:326)
        at weblogic.security.service.SecurityManager.runAsForUserCode(SecurityManager.java:197)
        at weblogic.servlet.provider.WlsSecurityProvider.runAsForUserCode(WlsSecurityProvider.java:203)
        at weblogic.servlet.provider.WlsSubjectHandle.run(WlsSubjectHandle.java:71)
        at weblogic.servlet.internal.WebAppServletContext.doSecuredExecute(WebAppServletContext.java:2443)
        at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2291)
        at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2269)
        at weblogic.servlet.internal.ServletRequestImpl.runInternal(ServletRequestImpl.java:1705)
        at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1665)
        at weblogic.servlet.provider.ContainerSupportProviderImpl$WlsRequestExecutor.run(ContainerSupportProviderImpl.java:272)
        at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:352)
        at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:337)
        at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:57)
        at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
        at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:644)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:415)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:355)
>

 

Based on all these logs, it is pretty clear that a specific component of the AEM Forms signature process was trying to use an algorithm that is not supported in FIPS 140-2 mode. Since our WebLogic Servers restrict this kind of weak algorithm, the method failed on the AEM Forms server side, which was obviously propagated to the AEM Workbench.

According to the Adobe documentation, AEM Forms is supposed to be FIPS compliant. Overall it does seem to support FIPS, but for this specific piece it doesn’t, and therefore AEM Forms isn’t FIPS 140-2 compliant. Since there was nothing we could do on our side to change that, we opened an SR with Adobe Support (#160202). After almost three months of explaining the situation and our requirements as well as investigating the issue on the Adobe side, they finally found the piece of code that was still using SHA-1.

Adobe then started a PoC to change this and provide a (hopefully) fully FIPS 140-2 compliant software. The fix was tested and validated in November 2018 and was therefore included in the next release: AEM Forms 6.4.3 (6.4 SP3), published on December 20, 2018. We installed it on day one since it was released to address our requirement, and after that the issue was indeed gone. Therefore, if you need to work with AEM Forms in a FIPS compliant environment, you should use an AEM version released after that date.

 

The article AEM Forms – FIPS 140-2 Support first appeared on Blog dbi services.

Documentum – FQDN Validation on RCS/CFS

Sun, 2019-12-01 04:00

In a previous blog, I talked about the possible usage of K8s Services in place of the default headless/pod name and the issues it brings. This one can be seen as a continuation since it is also related to the usage of K8s Services to install Documentum, but this time with another issue that is specific to an RCS/CFS. This issue & solution might be interesting for you even if you aren't using K8s.

As mentioned in this previous blog, the installation of a Primary CS using K8s Services is possible, but it might bring you some trouble with a few repository objects. To go further with the testing, without fixing the issues on the first CS, we tried to install an RCS/CFS (a second CS for High Availability) with the exact same parameters. As a reminder, this is what has been used:

  • Primary Content Server:
    • headless/pod: documentum-server-0.documentum-server.dbi-ns01.svc.cluster.local
    • K8s Service: cs01.dbi-ns01.svc.cluster.local
  • Remote Content Server:
    • headless/pod: documentum-server-1.documentum-server.dbi-ns01.svc.cluster.local
    • K8s Service: cs02.dbi-ns01.svc.cluster.local
  • Repository & Service: gr_repo

Therefore, the Repository silent properties file contained the following on this second CS:

[dmadmin@documentum-server-1 ~]$ grep -E "FQDN|HOST" RCS_Docbase_Global.properties
SERVER.FQDN=cs02.dbi-ns01.svc.cluster.local
SERVER.REPOSITORY_HOSTNAME=cs01.dbi-ns01.svc.cluster.local
SERVER.PRIMARY_CONNECTION_BROKER_HOST=cs01.dbi-ns01.svc.cluster.local
SERVER.PROJECTED_CONNECTION_BROKER_HOST=cs02.dbi-ns01.svc.cluster.local
SERVER.PROJECTED_DOCBROKER_HOST_OTHER=cs01.dbi-ns01.svc.cluster.local
[dmadmin@documentum-server-1 ~]$

 

I started the silent installation of the Repository and after a few seconds, the installer exited. Obviously, this means that something went wrong. Checking the installation logs:

[dmadmin@documentum-server-1 ~]$ cd $DM_HOME/install/logs
[dmadmin@documentum-server-1 logs]$ cat install.log
13:42:26,225  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product name is: CfsConfigurator
13:42:26,225  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product version is: 16.4.0000.0248
13:42:26,225  INFO [main]  -
13:42:26,308  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - Done InitializeSharedLibrary ...
13:42:26,332  INFO [main] com.documentum.install.multinode.cfs.installanywhere.actions.DiWAServerCfsInitializeImportantServerVariables - The installer is gathering system configuration information.
13:42:26,349  INFO [main] com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation - Start to verify the password
13:42:29,357  INFO [main] com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation - FQDN is invalid
13:42:29,359 ERROR [main] com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation - Fail to reach the computer with the FQDN "cs02.dbi-ns01.svc.cluster.local". Check the value you specified. Click Yes to ignore this error, or click No to re-enter the FQDN.
com.documentum.install.shared.common.error.DiException: Fail to reach the computer with the FQDN "cs02.dbi-ns01.svc.cluster.local". Check the value you specified. Click Yes to ignore this error, or click No to re-enter the FQDN.
        at com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation.setup(DiWASilentRemoteServerValidation.java:64)
        at com.documentum.install.shared.installanywhere.actions.InstallWizardAction.install(InstallWizardAction.java:73)
        at com.zerog.ia.installer.actions.CustomAction.installSelf(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.an(Unknown Source)
        at com.zerog.ia.installer.ConsoleBasedAAMgr.ac(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.am(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.runNextInstallPiece(Unknown Source)
        at com.zerog.ia.installer.ConsoleBasedAAMgr.ac(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.am(Unknown Source)
        ...
        at com.zerog.ia.installer.AAMgrBase.runNextInstallPiece(Unknown Source)
        at com.zerog.ia.installer.ConsoleBasedAAMgr.ac(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.am(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.runNextInstallPiece(Unknown Source)
        at com.zerog.ia.installer.ConsoleBasedAAMgr.ac(Unknown Source)
        at com.zerog.ia.installer.AAMgrBase.runPreInstall(Unknown Source)
        at com.zerog.ia.installer.LifeCycleManager.consoleInstallMain(Unknown Source)
        at com.zerog.ia.installer.LifeCycleManager.executeApplication(Unknown Source)
        at com.zerog.ia.installer.Main.main(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.zerog.lax.LAX.launch(Unknown Source)
        at com.zerog.lax.LAX.main(Unknown Source)
[dmadmin@documentum-server-1 logs]$

 

On the Primary CS, the installation using the K8s Service went smoothly without error, but on the Remote CS with the exact same setup, it failed with the message: ‘Fail to reach the computer with the FQDN “cs02.dbi-ns01.svc.cluster.local”. Check the value you specified. Click Yes to ignore this error, or click No to re-enter the FQDN.‘. So the installer binaries behave differently depending on whether it's a PCS or an RCS/CFS. Another funny thing is the message that says ‘Click Yes to ignore this error, or click No to re-enter the FQDN‘… That's obviously a GUI message being printed to the logs, but fortunately the silent installer isn't just waiting for an input that will never come.

I assumed that this had something to do with the K8s Services and some kind of network/hostname validation that the RCS/CFS installer is trying to do (which isn't done on the Primary). Therefore, I tried a few things like checking nslookup & ping and validating that the docbroker is responding:

[dmadmin@documentum-server-1 logs]$ nslookup cs01.dbi-ns01.svc.cluster.local
Server: 1.1.1.10
Address: 1.1.1.10#53

Name: cs01.dbi-ns01.svc.cluster.local
Address: 1.1.1.100
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ ping cs01.dbi-ns01.svc.cluster.local
PING cs01.dbi-ns01.svc.cluster.local (1.1.1.100) 56(84) bytes of data.
^C
--- cs01.dbi-ns01.svc.cluster.local ping statistics ---
12 packets transmitted, 0 received, 100% packet loss, time 10999ms
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ dmqdocbroker -t cs01.dbi-ns01.svc.cluster.local -p 1489 -c ping
dmqdocbroker: A DocBroker Query Tool
dmqdocbroker: Documentum Client Library Version: 16.4.0110.0058
Using specified port: 1489
Successful reply from docbroker at host (documentum-server-0) on port(1490) running software version (16.4.0110.0167  Linux64).
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ nslookup cs02.dbi-ns01.svc.cluster.local
Server: 1.1.1.10
Address: 1.1.1.10#53

Name: cs02.dbi-ns01.svc.cluster.local
Address: 1.1.1.200
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ ping cs02.dbi-ns01.svc.cluster.local
PING cs02.dbi-ns01.svc.cluster.local (1.1.1.200) 56(84) bytes of data.
^C
--- cs02.dbi-ns01.svc.cluster.local ping statistics ---
12 packets transmitted, 0 received, 100% packet loss, time 10999ms
[dmadmin@documentum-server-1 logs]$
[dmadmin@documentum-server-1 logs]$ dmqdocbroker -t cs02.dbi-ns01.svc.cluster.local -p 1489 -c ping
dmqdocbroker: A DocBroker Query Tool
dmqdocbroker: Documentum Client Library Version: 16.4.0110.0058
Using specified port: 1489
Successful reply from docbroker at host (documentum-server-1) on port(1490) running software version (16.4.0110.0167  Linux64).
[dmadmin@documentum-server-1 logs]$

 

As you can see above, the results are the same for the Primary CS and the Remote one. The only thing not responding is the ping, but that's because it's a K8s Service… At this point, I assumed that the RCS/CFS installer was trying to do something like a ping, which failed and therefore caused the error in the log and the stop of the installer. To validate that, I simply updated the file /etc/hosts a little bit (as root obviously):

[root@documentum-server-1 ~]$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1       localhost ip6-localhost ip6-loopback
fe00::0   ip6-localnet
fe00::0   ip6-mcastprefix
fe00::1   ip6-allnodes
fe00::2   ip6-allrouters
1.1.1.200  documentum-server-1.documentum-server.dbi-ns01.svc.cluster.local  documentum-server-1
[root@documentum-server-1 ~]$
[root@documentum-server-1 ~]$ echo '1.1.1.200  cs02.dbi-ns01.svc.cluster.local' >> /etc/hosts
[root@documentum-server-1 ~]$
[root@documentum-server-1 ~]$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1       localhost ip6-localhost ip6-loopback
fe00::0   ip6-localnet
fe00::0   ip6-mcastprefix
fe00::1   ip6-allnodes
fe00::2   ip6-allrouters
1.1.1.200  documentum-server-1.documentum-server.dbi-ns01.svc.cluster.local  documentum-server-1
1.1.1.200  cs02.dbi-ns01.svc.cluster.local
[root@documentum-server-1 ~]$

 

After doing that, I started the RCS/CFS installer in silent mode again (exact same command, no changes to the properties file) and this time it was able to complete the installation without issue.

[dmadmin@documentum-server-1 ~]$ cd $DM_HOME/install/logs
[dmadmin@documentum-server-1 logs]$ cat install.log
14:01:33,199 INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product name is: CfsConfigurator
14:01:33,199 INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product version is: 16.4.0000.0248
14:01:33,199 INFO [main] -
14:01:33,247 INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - Done InitializeSharedLibrary ...
14:01:33,278 INFO [main] com.documentum.install.multinode.cfs.installanywhere.actions.DiWAServerCfsInitializeImportantServerVariables - The installer is gathering system configuration information.
14:01:33,296 INFO [main] com.documentum.install.server.installanywhere.actions.DiWASilentRemoteServerValidation - Start to verify the password
14:01:33,906 INFO [main] com.documentum.fc.client.security.impl.JKSKeystoreUtilForDfc - keystore file name is /tmp/089972.tmp/dfc.keystore
14:01:34,394 INFO [main] com.documentum.fc.client.security.internal.CreateIdentityCredential$MultiFormatPKIKeyPair - generated RSA (2,048-bit strength) mutiformat key pair in 468 ms
14:01:34,428 INFO [main] com.documentum.fc.client.security.internal.CreateIdentityCredential - certificate created for DFC <CN=dfc_MlM5tLi5T9u1r82AdbulKv14vr8a,O=EMC,OU=Documentum> valid from Tue Sep 10 13:56:33 UTC 2019 to Fri Sep 07 14:01:33 UTC 2029:
14:01:34,429 INFO [main] com.documentum.fc.client.security.impl.JKSKeystoreUtilForDfc - keystore file name is /tmp/089972.tmp/dfc.keystore
14:01:34,446 INFO [main] com.documentum.fc.client.security.impl.InitializeKeystoreForDfc - [DFC_SECURITY_IDENTITY_INITIALIZED] Initialized new identity in keystore, DFC alias=dfc, identity=dfc_MlM5tLi5T9u1r82AdbulKv14vr8a
14:01:34,448 INFO [main] com.documentum.fc.client.security.impl.AuthenticationMgrForDfc - identity for authentication is dfc_MlM5tLi5T9u1r82AdbulKv14vr8a
14:01:34,449 INFO [main] com.documentum.fc.impl.RuntimeContext - DFC Version is 16.4.0110.0058
14:01:34,472 INFO [Timer-3] com.documentum.fc.client.impl.bof.cache.ClassCacheManager$CacheCleanupTask - [DFC_BOF_RUNNING_CLEANUP] Running class cache cleanup task
...
[dmadmin@documentum-server-1 logs]$

 

Since this obviously looks like a bug, I opened an SR with OpenText Support (#4252205). The outcome of this ticket is that the RCS/CFS installer indeed performs a different validation than the PCS installer, which is why the issue only appears for an RCS/CFS. At the moment, there is no way to skip this validation when using the silent installer (contrary to the GUI, which allows you to ‘click Yes‘). Therefore, OpenText decided to add a new parameter starting with CS 16.4 P20 (end of December 2019) to control whether the FQDN validation should be done or skipped. This new parameter will be “SERVER.VALIDATE_FQDN” and it will be a Boolean value. The default value will be “true”, so by default the FQDN validation will still be performed. To skip it starting with the P20, just set the value to false and the RCS/CFS installer should be able to complete successfully. To be tested once the patch is out!
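
Once the P20 is out, skipping the validation should then just be a matter of adding the new parameter to the silent properties file. A small sketch of what the RCS/CFS properties file could look like at that point (parameter name and Boolean value as announced by OpenText, to be confirmed once the patch is actually released):

[dmadmin@documentum-server-1 ~]$ grep -E "FQDN" RCS_Docbase_Global.properties
SERVER.FQDN=cs02.dbi-ns01.svc.cluster.local
SERVER.VALIDATE_FQDN=false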

 

The article Documentum – FQDN Validation on RCS/CFS first appeared on Blog dbi services.

Documentum – Usage of K8s Services to install Documentum?

Sun, 2019-12-01 02:00

In the past several months, we have been working extensively on setting up a CI/CD pipeline for Documentum at one of our customers. As part of this project, we are using Kubernetes pods for the Documentum components. In this blog, I will talk about an issue caused by what seemed like a good idea but, in the end, not so much…

The goal of this project is to migrate dozens of Documentum environments and several hundred VMs into K8s pods. In order to streamline the migration and simplify the management, we thought: why not try to use K8s Services (ingress) for all the communications between the pods as well as with the outside of K8s? Indeed, we needed to take into account several interfaces outside of the K8s world, usually old software that will most probably never support containerization. These interfaces will need to continue to work the way they used to, so we will need K8s Services at some point for the communications between Documentum and these external interfaces. Therefore, the idea was to try to use these exact same K8s Services to install the Documentum components.

By default, K8s creates a DNS name for each pod through the headless service, composed in the following way: <pod_name>.<service_name>.<namespace_name>.<cluster>. The goal here was therefore to additionally define a K8s Service for each Content Server: <service_name_ext>.<namespace_name>.<cluster>. This is what has been used:

  • Primary Content Server:
    • headless/pod: documentum-server-0.documentum-server.dbi-ns01.svc.cluster.local
    • K8s Service: cs01.dbi-ns01.svc.cluster.local
  • Remote Content Server:
    • headless/pod: documentum-server-1.documentum-server.dbi-ns01.svc.cluster.local
    • K8s Service: cs02.dbi-ns01.svc.cluster.local
  • Repository & Service: gr_repo

On a typical VM, you would usually install Documentum using the VM hostname. The equivalent on K8s would therefore be to use the headless/pod name. Alternatively, on a VM, you could think about using a DNS entry to install Documentum, and you might think that this should work. I sure did, and therefore we tried to do the same kind of thing on K8s, using the K8s Services directly.
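
For reference, this is what the headless/pod name looks like from inside the Primary CS pod (a quick check; the values are simply the ones from the list above, since the Kubernetes-managed /etc/hosts maps the pod IP to this FQDN):

[dmadmin@documentum-server-0 ~]$ hostname -f
documentum-server-0.documentum-server.dbi-ns01.svc.cluster.local
[dmadmin@documentum-server-0 ~]$ hostname
documentum-server-0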

Doing so for the Primary Content Server, all the Documentum silent installers completed successfully. We used “cs01.dbi-ns01.svc.cluster.local” for the following things for example:

  • Docbroker projections
  • Repository installation
  • DFC & CS Projections
  • BPM/xCP installation

Therefore, looking into the silent properties file for the Repository for example, it contained the following:

[dmadmin@documentum-server-0 ~]$ grep -E "FQDN|HOST" CS_Docbase_Global.properties
SERVER.FQDN=cs01.dbi-ns01.svc.cluster.local
SERVER.PROJECTED_DOCBROKER_HOST=cs01.dbi-ns01.svc.cluster.local
[dmadmin@documentum-server-0 ~]$

 

At the end of our silent installation (which includes the Documentum silent installers + dbi services' best practices: security, JMS configuration, projections, jobs, aso…), connection to the repository was possible and D2 & DA were both working properly, so it looked like a good first step. Unfortunately, when I reviewed the repository objects later, I saw some wrong objects and a bit of a mess in the repository: that's the whole purpose of this blog, to explain what went wrong when using a K8s Service instead of the headless/pod name.

After a quick review, I found the following things that were wrong/messy:

  • dm_jms_config object
    • Expected: for a Primary Content Server, you should have one JMS config object with “do_mail”, “do_method” and “SAMLAuthentication” at least (+ “do_bpm” for BPM/xCP, Indexagent ones, aso…)
      • JMS <FQDN>:9080 for gr_repo.gr_repo
    • Actual: the installer created two JMS Objects, one with a correct name (using FQDN provided in installer = K8s Service), one with a wrong name (using pod name (short-name, no domain))
      • JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo => Correct one and it contained all the needed servlets (“do_mail”, “do_method”, “do_bpm” and “SAMLAuthentication”)
      • JMS documentum-server-0:9080 for gr_repo.gr_repo => Wrong one and it contained all the do_ servlets but not the SAML one strangely (“do_mail”, “do_method” and “do_bpm” only, not “SAMLAuthentication”)
  • dm_acs_config object
    • Expected: just like for the JMS, you would expect the object to be created with the FQDN you gave it in the installer
      • <FQDN>ACS1
    • Actual: the installer created the ACS config object using the headless/pod name (the full name this time and not the short name)
      • documentum-server-0.documentum-server.dbi-ns01.svc.cluster.localACS1
  • A lot of other references to the headless/pod name: dm_user, dm_job, dm_client_registration, dm_client_rights, aso… (see the DQL sketch right after this list)
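
As an illustration of the last point, this is the kind of DQL that can be used to spot such references, object type by object type. A sketch for dm_job only (where the host usually shows up in the target_server attribute); similar queries can be written for the other types:

API> ?,c,select r_object_id, object_name, target_server from dm_job where target_server like '%documentum-server-0%';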

So in short, sometimes the Repository installer uses the FQDN provided (K8s Service) and sometimes it doesn't. So what's the point in providing an FQDN during the installation if it is ignored for 90% of the objects anyway? In addition, the installer also creates two JMS config objects at the same time, with different names and different servlets. Looking at the “dm_jms_config_setup.out” log file created by the installer when it executed the JMS config object creation, you can see that it mentions the creation of only one object and yet, at the end, it says that there are two:

[dmadmin@documentum-server-0 ~]$ cat $DOCUMENTUM/dba/config/gr_repo/dm_jms_config_setup.out
/app/dctm/server/product/16.4/bin/dm_jms_admin.sh -docbase gr_repo.gr_repo -username dmadmin -action add,enableDFC,testDFC,migrate,dumpServerCache,listAll -jms_host_name cs01.dbi-ns01.svc.cluster.local -jms_port 9080 -jms_proximity 1 -webapps ServerApps -server_config_id 3d0f123450000102
2019-10-21 09:50:55 UTC:  Input arguments are: -docbase gr_repo.gr_repo -username dmadmin -action add,enableDFC,testDFC,migrate,dumpServerCache,listAll -jms_host_name cs01.dbi-ns01.svc.cluster.local -jms_port 9080 -jms_proximity 1 -webapps ServerApps -server_config_id 3d0f123450000102
2019-10-21 09:50:55 UTC:  Input parameters are: {jms_port=[9080], server_config_id=[3d0f123450000102], docbase=[gr_repo.gr_repo], webapps=[ServerApps], action=[add,enableDFC,testDFC,migrate,dumpServerCache,listAll], jms_proximity=[1], jms_host_name=[cs01.dbi-ns01.svc.cluster.local], username=[dmadmin]}
2019-10-21 09:50:55 UTC:  ======================================================================================
2019-10-21 09:50:55 UTC:  Begin administering JMS config objects in docbase gr_repo.gr_repo ...
2019-10-21 09:51:01 UTC:  The following JMS config object has been successfully created/updated in docbase gr_repo
2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:                      JMS Config Name: JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
                      JMS Config ID: 080f1234500010a3
                      JMS Host Name: cs01.dbi-ns01.svc.cluster.local
                    JMS Port Number: 9080
             Is Disabled In Docbase: F
               Repeating attributes:
               Content_Server_Id[0] = 3d0f123450000102
        Content_Server_Host_Name[0] = documentum-server-0
    JMS_Proximity_Relative_to_CS[0] = 2
             Servlet to URI Mapping:
                          do_method = http://cs01.dbi-ns01.svc.cluster.local:9080/DmMethods/servlet/DoMethod
                 SAMLAuthentication = http://cs01.dbi-ns01.svc.cluster.local:9080/SAMLAuthentication/servlet/ValidateSAMLResponse
                            do_mail = http://cs01.dbi-ns01.svc.cluster.local:9080/DmMail/servlet/DoMail

2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:  Successfully enabled principal_auth_priv for current DFC client  in docbase gr_repo
2019-10-21 09:51:01 UTC:  Successfully tested principal_auth_priv for current DFC client  in docbase gr_repo
2019-10-21 09:51:01 UTC:  Successfully migrated content server 3d0f123450000102 to use JMS config object(s)
2019-10-21 09:51:01 UTC:  Dump of JMS Config List in content server cache, content server is gr_repo
2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:  USER ATTRIBUTES

  jms_list_last_refreshed         : Mon Oct 21 09:51:01 2019
  incr_wait_time_on_failure       : 30
  max_wait_time_on_failure        : 3600
  current_jms_index               : -1
  jms_config_id                [0]: 080f1234500010a3
                               [1]: 080f1234500010a4
  jms_config_name              [0]: JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
                               [1]: JMS documentum-server-0:9080 for gr_repo.gr_repo
  server_config_id             [0]: 3d0f123450000102
                               [1]: 3d0f123450000102
  server_config_name           [0]: gr_repo
                               [1]: gr_repo
  jms_to_cs_proximity          [0]: 2
                               [1]: 1
  is_disabled_in_docbase       [0]: F
                               [1]: F
  is_marked_dead_in_cache      [0]: F
                               [1]: F
  intended_purpose             [0]: DM_JMS_PURPOSE_FOR_LOAD_BALANCING
                               [1]: DM_JMS_PURPOSE_DEFAULT_EMBEDDED_JMS
  last_failure_time            [0]: N/A
                               [1]: N/A
  next_retry_time              [0]: N/A
                               [1]: N/A
  failure_count                [0]: 0
                               [1]: 0

SYSTEM ATTRIBUTES


APPLICATION ATTRIBUTES


INTERNAL ATTRIBUTES


2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:  Total 2 JMS Config objects found in docbase gr_repo
2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:                      JMS Config Name: JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
                      JMS Config ID: 080f1234500010a3
                      JMS Host Name: cs01.dbi-ns01.svc.cluster.local
                    JMS Port Number: 9080
             Is Disabled In Docbase: F
               Repeating attributes:
               Content_Server_Id[0] = 3d0f123450000102
        Content_Server_Host_Name[0] = documentum-server-0
    JMS_Proximity_Relative_to_CS[0] = 2
             Servlet to URI Mapping:
                          do_method = http://cs01.dbi-ns01.svc.cluster.local:9080/DmMethods/servlet/DoMethod
                 SAMLAuthentication = http://cs01.dbi-ns01.svc.cluster.local:9080/SAMLAuthentication/servlet/ValidateSAMLResponse
                            do_mail = http://cs01.dbi-ns01.svc.cluster.local:9080/DmMail/servlet/DoMail

2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:                      JMS Config Name: JMS documentum-server-0:9080 for gr_repo.gr_repo
                      JMS Config ID: 080f1234500010a4
                      JMS Host Name: documentum-server-0
                    JMS Port Number: 9080
             Is Disabled In Docbase: F
               Repeating attributes:
               Content_Server_Id[0] = 3d0f123450000102
        Content_Server_Host_Name[0] = documentum-server-0
    JMS_Proximity_Relative_to_CS[0] = 1
             Servlet to URI Mapping:
                          do_method = http://documentum-server-0:9080/DmMethods/servlet/DoMethod
                            do_mail = http://documentum-server-0:9080/DmMail/servlet/DoMail

2019-10-21 09:51:01 UTC:  --------------------------------------------------------------------------------------
2019-10-21 09:51:01 UTC:  Done administering JMS config objects in docbase gr_repo.gr_repo: status=SUCCESS ...
2019-10-21 09:51:01 UTC:  ======================================================================================
Program exit status = 0 = SUCCESS
Connect to docbase gr_repo.gr_repo as user dmadmin.
Start running dm_jms_config_setup.ebs script on docbase gr_repo.gr_repo
[DM_API_E_NO_MATCH]error:  "There was no match in the docbase for the qualification: dm_method where object_name='dm_JMSAdminConsole'"


dm_method dm_JMSAdminConsole object does not exist, yet.
jarFile = /app/dctm/server/product/16.4/lib/dmjmsadmin.jar
wrapper_script = /app/dctm/server/product/16.4/bin/dm_jms_admin.sh
Create dm_method dm_JMSAdminConsole object in docbase now
new dm_JMSAdminConsole dm_method object created in docbase successfully
new object id is: 100f123450001098
Begin updating JMS_LOCATION for Java Methods ...
Assign JMS_LOCATION=ANY to a_extended_properties in method object CTSAdminMethod
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_bp_transition_java
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_bp_schedule_java
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_bp_batch_java
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_bp_validate_java
Assign JMS_LOCATION=ANY to a_extended_properties in method object dm_event_template_sender
Done updating JMS_LOCATION for Java Methods ...
Begin create default JMS config object for content server
Content Server version: 16.4.0110.0167  Linux64.Oracle
Content Server ID: 3d0f123450000102
dm_jms_config type id = 030f12345000017c
jms_count = 0
wrapper_script = /app/dctm/server/product/16.4/bin/dm_jms_admin.sh
script_params =  -docbase gr_repo.gr_repo -username dmadmin -action add,enableDFC,testDFC,migrate,dumpServerCache,listAll  -jms_host_name cs01.dbi-ns01.svc.cluster.local -jms_port 9080 -jms_proximity 1 -webapps ServerApps  -server_config_id 3d0f123450000102
cmd = /app/dctm/server/product/16.4/bin/dm_jms_admin.sh  -docbase gr_repo.gr_repo -username dmadmin -action add,enableDFC,testDFC,migrate,dumpServerCache,listAll  -jms_host_name cs01.dbi-ns01.svc.cluster.local -jms_port 9080 -jms_proximity 1 -webapps ServerApps  -server_config_id 3d0f123450000102
status = 0
Finished creating default JMS config object for content server
Finished running dm_jms_config_setup.ebs...
Disconnect from the docbase.
[dmadmin@documentum-server-0 ~]$

 

In the log file above, there is no mention of “do_bpm” because this is the installation of the Repository and therefore, at that time, BPM/xCP isn't installed yet. We only install it later, switch the URLs to HTTPS and do a few other things. So looking into the objects in the Repository, this is what we can see at the end of all installations (I purposely only executed the HTTP->HTTPS switch + the BPM/xCP addition, but not the JMS Projections, so that what you see below are still the default values added by the installer, which are also wrong):

[dmadmin@documentum-server-0 ~]$ iapi gr_repo
Please enter a user (dmadmin):
Please enter password for dmadmin:


        OpenText Documentum iapi - Interactive API interface
        Copyright (c) 2018. OpenText Corporation
        All rights reserved.
        Client Library Release 16.4.0110.0058


Connecting to Server using docbase gr_repo
[DM_SESSION_I_SESSION_START]info:  "Session 010f12345000117c started for user dmadmin."


Connected to OpenText Documentum Server running Release 16.4.0110.0167  Linux64.Oracle
Session id is s0
API> ?,c,select count(*) from dm_server_config;
count(*)
----------------------
                     1
(1 row affected)

API> ?,c,select r_object_id, object_name, app_server_name, app_server_uri from dm_server_config order by object_name, app_server_name;
r_object_id       object_name  app_server_name  app_server_uri
----------------  -----------  ---------------  -----------------------------------------------------------------------
3d0f123450000102  gr_repo      do_bpm           https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
                               do_mail          https://cs01.dbi-ns01.svc.cluster.local:9082/DmMail/servlet/DoMail
                               do_method        https://cs01.dbi-ns01.svc.cluster.local:9082/DmMethods/servlet/DoMethod
(1 row affected)

API> ?,c,select count(*) from dm_jms_config;
count(*)
----------------------
                     2
(1 row affected)

API> ?,c,select r_object_id, object_name from dm_jms_config order by object_name;
r_object_id       object_name
----------------  ------------------------------------------------------------
080f1234500010a3  JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
080f1234500010a4  JMS documentum-server-0:9080 for gr_repo.gr_repo
(2 rows affected)

API> dump,c,080f1234500010a3
...
USER ATTRIBUTES

  object_name                     : JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo
  title                           :
  subject                         :
  authors                       []: <none>
  keywords                      []: <none>
  resolution_label                :
  owner_name                      : dmadmin
  owner_permit                    : 7
  group_name                      : docu
  group_permit                    : 5
  world_permit                    : 3
  log_entry                       :
  acl_domain                      : dmadmin
  acl_name                        : dm_450f123450000101
  language_code                   :
  server_config_id             [0]: 3d0f123450000102
  config_type                     : 2
  servlet_name                 [0]: do_method
                               [1]: SAMLAuthentication
                               [2]: do_mail
                               [3]: do_bpm
  base_uri                     [0]: https://cs01.dbi-ns01.svc.cluster.local:9082/DmMethods/servlet/DoMethod
                               [1]: https://cs01.dbi-ns01.svc.cluster.local:9082/SAMLAuthentication/servlet/ValidateSAMLResponse
                               [2]: https://cs01.dbi-ns01.svc.cluster.local:9082/DmMail/servlet/DoMail
                               [3]: https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
  supported_protocol           [0]: https
                               [1]: https
                               [2]: https
                               [3]: https
  projection_netloc_enable      []: <none>
  projection_netloc_ident       []: <none>
  projection_enable            [0]: T
  projection_proximity_value   [0]: 2
  projection_targets           [0]: documentum-server-0
  projection_ports             [0]: 0
  network_locations             []: <none>
  server_major_version            :
  server_minor_version            :
  is_disabled                     : F

SYSTEM ATTRIBUTES

  r_object_type                   : dm_jms_config
  r_creation_date                 : 10/21/2019 09:51:00
  r_modify_date                   : 10/21/2019 10:49:08
  r_modifier                      : dmadmin
  r_access_date                   : nulldate
  r_composite_id                []: <none>
  r_composite_label             []: <none>
  r_component_label             []: <none>
  r_order_no                    []: <none>
  r_link_cnt                      : 0
  r_link_high_cnt                 : 0
  r_assembled_from_id             : 0000000000000000
  r_frzn_assembly_cnt             : 0
  r_has_frzn_assembly             : F
  r_is_virtual_doc                : 0
  r_page_cnt                      : 0
  r_content_size                  : 0
  r_lock_owner                    :
  r_lock_date                     : nulldate
  r_lock_machine                  :
  r_version_label              [0]: 1.0
                               [1]: CURRENT
  r_immutable_flag                : F
  r_frozen_flag                   : F
  r_has_events                    : F
  r_creator_name                  : dmadmin
  r_is_public                     : T
  r_policy_id                     : 0000000000000000
  r_resume_state                  : 0
  r_current_state                 : 0
  r_alias_set_id                  : 0000000000000000
  r_full_content_size             : 0
  r_aspect_name                 []: <none>
  r_object_id                     : 080f1234500010a3

APPLICATION ATTRIBUTES

  a_application_type              :
  a_status                        :
  a_is_hidden                     : F
  a_retention_date                : nulldate
  a_archive                       : F
  a_compound_architecture         :
  a_link_resolved                 : F
  a_content_type                  :
  a_full_text                     : T
  a_storage_type                  :
  a_special_app                   :
  a_effective_date              []: <none>
  a_expiration_date             []: <none>
  a_publish_formats             []: <none>
  a_effective_label             []: <none>
  a_effective_flag              []: <none>
  a_category                      :
  a_is_template                   : F
  a_controlling_app               :
  a_extended_properties         []: <none>
  a_is_signed                     : F
  a_last_review_date              : nulldate

INTERNAL ATTRIBUTES

  i_is_deleted                    : F
  i_reference_cnt                 : 1
  i_has_folder                    : T
  i_folder_id                  [0]: 0c0f123450000105
  i_contents_id                   : 0000000000000000
  i_cabinet_id                    : 0c0f123450000105
  i_antecedent_id                 : 0000000000000000
  i_chronicle_id                  : 080f1234500010a3
  i_latest_flag                   : T
  i_branch_cnt                    : 0
  i_direct_dsc                    : F
  i_is_reference                  : F
  i_retain_until                  : nulldate
  i_retainer_id                 []: <none>
  i_partition                     : 0
  i_is_replica                    : F
  i_vstamp                        : 4

API> dump,c,080f1234500010a4
...
USER ATTRIBUTES

  object_name                     : JMS documentum-server-0:9080 for gr_repo.gr_repo
  title                           :
  subject                         :
  authors                       []: <none>
  keywords                      []: <none>
  resolution_label                :
  owner_name                      : dmadmin
  owner_permit                    : 7
  group_name                      : docu
  group_permit                    : 5
  world_permit                    : 3
  log_entry                       :
  acl_domain                      : dmadmin
  acl_name                        : dm_450f123450000101
  language_code                   :
  server_config_id             [0]: 3d0f123450000102
  config_type                     : 2
  servlet_name                 [0]: do_method
                               [1]: do_mail
                               [2]: do_bpm
  base_uri                     [0]: https://documentum-server-0:9082/DmMethods/servlet/DoMethod
                               [1]: https://documentum-server-0:9082/DmMail/servlet/DoMail
                               [2]: https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
  supported_protocol           [0]: https
                               [1]: https
                               [2]: https
  projection_netloc_enable      []: <none>
  projection_netloc_ident       []: <none>
  projection_enable            [0]: T
  projection_proximity_value   [0]: 1
  projection_targets           [0]: documentum-server-0
  projection_ports             [0]: 0
  network_locations             []: <none>
  server_major_version            :
  server_minor_version            :
  is_disabled                     : F

SYSTEM ATTRIBUTES

  r_object_type                   : dm_jms_config
  r_creation_date                 : 10/21/2019 09:51:01
  r_modify_date                   : 10/21/2019 10:50:20
  r_modifier                      : dmadmin
  r_access_date                   : nulldate
  r_composite_id                []: <none>
  r_composite_label             []: <none>
  r_component_label             []: <none>
  r_order_no                    []: <none>
  r_link_cnt                      : 0
  r_link_high_cnt                 : 0
  r_assembled_from_id             : 0000000000000000
  r_frzn_assembly_cnt             : 0
  r_has_frzn_assembly             : F
  r_is_virtual_doc                : 0
  r_page_cnt                      : 0
  r_content_size                  : 0
  r_lock_owner                    :
  r_lock_date                     : nulldate
  r_lock_machine                  :
  r_version_label              [0]: 1.0
                               [1]: CURRENT
  r_immutable_flag                : F
  r_frozen_flag                   : F
  r_has_events                    : F
  r_creator_name                  : dmadmin
  r_is_public                     : T
  r_policy_id                     : 0000000000000000
  r_resume_state                  : 0
  r_current_state                 : 0
  r_alias_set_id                  : 0000000000000000
  r_full_content_size             : 0
  r_aspect_name                 []: <none>
  r_object_id                     : 080f1234500010a4

APPLICATION ATTRIBUTES

  a_application_type              :
  a_status                        :
  a_is_hidden                     : F
  a_retention_date                : nulldate
  a_archive                       : F
  a_compound_architecture         :
  a_link_resolved                 : F
  a_content_type                  :
  a_full_text                     : T
  a_storage_type                  :
  a_special_app                   :
  a_effective_date              []: <none>
  a_expiration_date             []: <none>
  a_publish_formats             []: <none>
  a_effective_label             []: <none>
  a_effective_flag              []: <none>
  a_category                      :
  a_is_template                   : F
  a_controlling_app               :
  a_extended_properties         []: <none>
  a_is_signed                     : F
  a_last_review_date              : nulldate

INTERNAL ATTRIBUTES

  i_is_deleted                    : F
  i_reference_cnt                 : 1
  i_has_folder                    : T
  i_folder_id                  [0]: 0c0f123450000105
  i_contents_id                   : 0000000000000000
  i_cabinet_id                    : 0c0f123450000105
  i_antecedent_id                 : 0000000000000000
  i_chronicle_id                  : 080f1234500010a4
  i_latest_flag                   : T
  i_branch_cnt                    : 0
  i_direct_dsc                    : F
  i_is_reference                  : F
  i_retain_until                  : nulldate
  i_retainer_id                 []: <none>
  i_partition                     : 0
  i_is_replica                    : F
  i_vstamp                        : 2

API> ?,c,select r_object_id, object_name, servlet_name, supported_protocol, base_uri from dm_jms_config order by object_name, servlet_name;
r_object_id       object_name                                                   servlet_name        supported_protocol  base_uri                                                                                                                  
----------------  ------------------------------------------------------------  ------------------  ------------------  --------------------------------------------------------------------------------------------
080f1234500010a3  JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo  SAMLAuthentication  https               https://cs01.dbi-ns01.svc.cluster.local:9082/SAMLAuthentication/servlet/ValidateSAMLResponse
                                                                                do_bpm              https               https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
                                                                                do_mail             https               https://cs01.dbi-ns01.svc.cluster.local:9082/DmMail/servlet/DoMail
                                                                                do_method           https               https://cs01.dbi-ns01.svc.cluster.local:9082/DmMethods/servlet/DoMethod
080f1234500010a4  JMS documentum-server-0:9080 for gr_repo.gr_repo              do_bpm              https               https://cs01.dbi-ns01.svc.cluster.local:9082/bpm/servlet/DoMethod
                                                                                do_mail             https               https://documentum-server-0:9082/DmMail/servlet/DoMail
                                                                                do_method           https               https://documentum-server-0:9082/DmMethods/servlet/DoMethod
(2 rows affected)

API> ?,c,select r_object_id, object_name, projection_enable, projection_proximity_value, projection_ports, projection_targets from dm_jms_config order by object_name, projection_targets;
r_object_id       object_name                                                   projection_enable  projection_proximity_value  projection_ports  projection_targets
----------------  ------------------------------------------------------------  -----------------  --------------------------  ----------------  -------------------
080f1234500010a3  JMS cs01.dbi-ns01.svc.cluster.local:9080 for gr_repo.gr_repo                  1                           2                 0  documentum-server-0
080f1234500010a4  JMS documentum-server-0:9080 for gr_repo.gr_repo                              1                           1                 0  documentum-server-0
(2 rows affected)

API> ?,c,select count(*) from dm_acs_config;
count(*)
----------------------
                     1
(1 row affected)

API> ?,c,select r_object_id, object_name, acs_supported_protocol, acs_base_url from dm_acs_config order by object_name, acs_base_url;
r_object_id       object_name                                                           acs_supported_protocol  acs_base_url
----------------  --------------------------------------------------------------------  ----------------------  ------------------------------------------------------------
080f123450000490  documentum-server-0.documentum-server.dbi-ns01.svc.cluster.localACS1  https                   https://cs01.dbi-ns01.svc.cluster.local:9082/ACS/servlet/ACS
(1 row affected)

API> exit
Bye
[dmadmin@documentum-server-0 ~]$

 

So what to do with that? Well, a simple solution is to just remove the wrong JMS config object (the second one) and redo the JMS Projections, as sketched below. You can live with the wrong name of the ACS config object and the other wrong references: even if it's ugly, it will work properly; it's really just the second JMS config object that might cause you some trouble. You can either script all that so it's done properly in the end, or do it manually, but obviously when you have a project with a few hundred Content Servers, a simple manual task can become a nightmare ;). Another obvious solution is to not use the K8s Service but stick with the headless/pod name. With this second solution, you might as well try to use the MigrationUtil utility to change all references to the hostname after the installation is done. That would be something interesting to test!
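
A minimal sketch of that cleanup with iapi, assuming you keep the K8s Service approach and that the id of the wrong JMS config object is the one dumped above (the JMS Projections then need to be redone afterwards):

[dmadmin@documentum-server-0 ~]$ iapi gr_repo -Udmadmin -Pxxx << EOC
destroy,c,080f1234500010a4
reinit,c
EOC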

 

The article Documentum – Usage of K8s Services to install Documentum? first appeared on Blog dbi services.

Documentum – Database password validation rule in 16.4

Sun, 2019-12-01 00:00

A few months ago, I started working with CS 16.4 (always using the silent installation) and I had the pleasant surprise of seeing a new error message in the installation log. It's always such a pleasure to lose time on pretty stupid things like the one I will talk about in this blog.

So what's the issue? Well, upon installing a new repository, I saw an error message around the start of the silent installation. In the end, the process didn't stop and the repository was actually installed and functional – as far as I could see – but I needed to check this deeper, to be sure that there was no problem. This is an extract of the installation log showing the exact error message:

[dmadmin@documentum-server-0 ~]$ cd $DM_HOME/install/logs
[dmadmin@documentum-server-0 logs]$ cat install.log
14:45:02,608  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product name is: UniversalServerConfigurator
14:45:02,608  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - The product version is: 16.4.0000.0248
14:45:02,608  INFO [main]  -
14:45:02,660  INFO [main] com.documentum.install.shared.installanywhere.actions.InitializeSharedLibrary - Done InitializeSharedLibrary ...
14:45:02,698  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerInformation - Setting CONFIGURE_DOCBROKER value to TRUE for SERVER
14:45:02,699  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerInformation - Setting CONFIGURE_DOCBASE value to TRUE for SERVER
14:45:03,701  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerCheckEnvrionmentVariable - The installer was started using the dm_launch_server_config_program.sh script.
14:45:03,701  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerCheckEnvrionmentVariable - The installer will determine the value of environment variable DOCUMENTUM.
14:45:06,702  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerCheckEnvrionmentVariable - The installer will determine the value of environment variable PATH.
14:45:09,709 ERROR [main] com.documentum.install.server.installanywhere.actions.DiWAServerValidteVariables - Invalid database user password. Valid database user password rules are:
1. Must contain only ASCII alphanumeric characters,'.', '_' and '-'.
Please enter a valid database user password.
14:45:09,717  INFO [main]  - The license file:/app/dctm/server/dba/tcs_license exists.
14:45:09,721  INFO [main] com.documentum.install.server.installanywhere.actions.DiWASilentConfigurationInstallationValidation - Start to validate docbase parameters.
14:45:09,723  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerPatchExistingDocbaseAction - The installer will obtain all the DOCBASE on the machine.
14:45:11,742  INFO [main] com.documentum.install.server.installanywhere.actions.DiWAServerDocAppFolder - The installer will obtain all the DocApps which could be installed for the repository.
...
[dmadmin@documentum-server-0 logs]$

 

As you can see above, the error message is self-explanatory: the Database password used doesn't comply with the so-called “rules”. Seeing this kind of message, you would expect the installer to stop since the password doesn't comply: it shouldn't install the Repository. Yet it just skips the check and completes without problem.

On my side, I have always been using the same rule for passwords in Documentum: at least 1 lowercase, 1 uppercase, 1 digit, 1 special character and a total of 15 or more characters. Just comparing the password that was used for the Database with what is printed in the log, the only reason why the password wouldn't be valid is that I put a ‘+’ in it. In previous versions of Documentum, I often used a ‘+’ and I never had any issue or error with it.

So I checked with OpenText Support (#4240691) to get more details on what is happening here. It turns out that starting with CS 16.4, OpenText added a new password validation for the Database account: this password must indeed only contain alphanumeric characters, ‘.’, ‘_’ or ‘-‘… So they added a password validation which complains but doesn't actually do anything. Actually, it's even worse: the CS Team added this password validation with CS 16.4 and enforced the rule, but only for the GUI installer. The same check was added only later to the silent installation and was not enforced at that time. That's the reason why, if you try using the same password in the GUI, it should fail, while with the silent installation it prints an error but still completes successfully… Therefore, with the same binaries, you have two different behaviors. That's pretty cool, right? Right? RIGHT?

In the end, a new defect (#CS-121161) has been raised and it seems they will enforce the rule in a coming patch. Therefore, if you are planning to use ‘+’ characters in your Database passwords, consider changing them upfront to avoid a failure during the Repository installation. Looks like this time I should have stayed quiet and maybe I would have been able to use ‘+’ for the next 10 years with the silent installations… Sorry!
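
To avoid the surprise altogether, a quick sanity check of the candidate password before launching the silent installer can help. A simple sketch (the password shown is just an example; the regex reflects the rule printed in the log above):

[dmadmin@documentum-server-0 ~]$ db_password='SomeP+ssw0rd_2019'
[dmadmin@documentum-server-0 ~]$ if echo "${db_password}" | grep -qE '^[A-Za-z0-9._-]+$'; then echo "password OK for CS 16.4"; else echo "password contains characters rejected by the CS 16.4 rule"; fi
password contains characters rejected by the CS 16.4 rule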

 

The article Documentum – Database password validation rule in 16.4 first appeared on Blog dbi services.

Focus on 19c NOW!

Fri, 2019-11-29 08:00
Introduction

For years, Oracle used the same mechanism for database versioning: a major version, represented by the first number, and then a release number, 1 for the very first edition and a mature and reliable release 2 for production databases. Both of them had patchsets (the last number) and regular patchset updates (the date optionally displayed at the end) to remove bugs and increase security. Jumping from release 1 to release 2 required a migration, as if you were coming from an older version. Recently, Oracle broke this release pace and introduced a new versioning system based on the year of release, like Microsoft and a lot of others did. Patchsets were also replaced by release updates, which is quite logical: patchsets have effectively been complete releases for a long time. Lots of Oracle DBAs are now in the fog and, as a result, could take the wrong decision regarding which version to choose.

A recent history of Oracle Database versioning

Let's focus on the versions currently running on most customers' databases:

  • 11.2.0.4: The terminal version of 11gR2 (long-term). 4 is the latest patchset of the 11gR2, there will never exist a 11.2.0.5. If you install the latest PSU (Patchset update) your database will precisely run on 11.2.0.4.191015 (as of the 29th of November 2019)
  • 12.1.0.2: The terminal version of 12cR1 (sort of long-term). A 12.1.0.1 existed but for a very short time
  • 12.2.0.1: first version of 12cR2 (short-term). This is the latest version with old versioning model
  • 18c: actually 12.2.0.2 – first patchset of the 12.2.0.1 (short-term). You cannot apply this patchset on top of the 12.2.0.1
  • 19c: actually 12.2.0.3 – terminal version of the 12cR2 (long-term). The next version will no more be based on 12.2 database kernel

18c and 19c also have a sort of patchset, but the name has changed: we're now talking about RUs (Release Updates). The RU is actually the second number, 18.8 for example. Each release update can also receive further fixes, still reflected in the last numbers, for example 18.8.0.0.191015.
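
If you are not sure which release update an instance is actually running, you can check the full version string directly in the database. This works on 18c and 19c, where the BANNER_FULL column was introduced; the output below is just an illustration for a 19c instance patched with the 19.5 RU:

SQL> select banner_full from v$version;

BANNER_FULL
--------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.5.0.0.0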

Is there a risk to use older versions?

Actually, there is no risk in using 11.2.0.4 and 12.1.0.2: these versions represent almost all the Oracle databases running in the world, and few people have already migrated to 12.2 or newer versions. The risk is more related to the support provided by Oracle. Once your version has left premier support (the support level linked to the fees almost every customer pays each year), you only get limited access to My Oracle Support: searching the knowledge base is OK, downloading old patches is OK, but downloading the newest patches is no longer possible. And if you open an SR, the Oracle support team could ask you to buy extended support, or at least to apply the latest PSU you cannot download. If you want to keep your databases fully supported by Oracle, you'll have to ask and pay for extended support, as long as your version is still eligible for that kind of support. For sure, 11gR1, 10gR2 and older versions are no longer eligible for extended support.

Check this My Oracle Support note for fresh information about support timeline: Doc ID 742060.1

Should I migrate to 12.2 or 18c?

If you plan to migrate to 12.2 or 18c in 2020, think twice. The problem with these versions is that premier support is ending soon: before the end of 2020 for 12.2 and in the middle of 2021 for 18c. It's very short, and you probably won't have the possibility to buy extended support (these are not terminal releases), so you'll have to migrate again to 19c or a newer version in 2020 or 2021.

Why 19c is probably the only version you should migrate to?

19c is the long-term support release, meaning that premier support will last longer (until 2023) and also that extended support will be available (until 2026). If you plan to migrate to 19c in 2020, you will benefit from all the desired patches and full support for 3 years. And there is a chance that Oracle will also offer extended support for the first year or more, as they did for 11.2 and 12.1, even if that's pure assumption.

How about the costs?

You probably own perpetual licenses, meaning that the Oracle database product is yours (as long as you are compliant regarding the number of users or processors defined in your contract). Your licenses are not attached to a specific version: you can use 11gR2, 12c, 18c, 19c… Each year, you pay support fees: these fees give you access to My Oracle Support, for downloading patches or opening a Service Request in case of problem. But you are supposed to run a recent version of the database to stay under this premier support. For example, as of the 29th of November 2019, the versions supported with premier support are 12.2.0.1, 18c and 19c. If you're using older versions, like 12.1.0.2 or 11.2.0.4, you should pay additional fees for extended support. Extended support is not something you have to subscribe to indefinitely, as the purpose is only to keep your database supported until you migrate to a newer version and return to premier support.

So, keeping older versions will cost you more, and in-time migration will keep your support fees as low as possible.

For sure, migrating to 19c also comes at a cost, but we're now quite aware of the importance of migrating software and staying up to date, for a lot of reasons.

Conclusion

Motivate your software vendor or your development team to validate and support 19c. The amount of work for supporting 19c versus 18c or 12.2 is about the same, since all these versions are actually based on the same 12.2 kernel: the behaviour of the database will be the same for most of us. Avoid migrating to 12.2.0.1 or 18c, as you'll have to migrate again within a year. If you're not ready yet, keep your 11gR2 and/or 12cR1 and take extended support for one year while preparing the migration to 19c. 20c will be a kind of very first release 1: you probably won't migrate to this version if stability and reliability are your main concerns for your databases.

The article Focus on 19c NOW! first appeared on Blog dbi services.

dbvisit dbvctl process is terminating abnormally with Error Code: 2044

Fri, 2019-11-29 03:07

When applying archive logs on the standby, the dbvctl process can terminate abnormally with Error Code: 2044. This can happen when there are several huge archive logs to be applied.

Problem description

With dbvisit there are 2 ways to recover the archive logs on the standby: either using sqlplus or rman. By default the configuration is set to sqlplus. It can happen that, following a maintenance window where synchronization had to be suspended, there is a huge gap between the primary and the standby databases and several archive logs need to be applied. The problem is even more visible if the archive log files are big. In my case there were about 34 archive logs to be applied following a maintenance activity, and the size of each file was 8 GB.

Applying the archive logs on the standby failed as seen in the following output.

oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] ./dbvctl -d DDC_name
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 1939)
dbvctl started on server_name: Mon Oct 28 16:19:01 2019
=============================================================
 
 
>>> Applying Log file(s) from primary_server to DB_name on standby_server:
 
 
Dbvisit Standby terminated...
Error Code: 2044
File (/u01/app/dbvisit/standby/tmp/1939.dbvisit.201910281619.sqlplus.dbv) does not
exist or is empty. Please check space and file permissions.
 
Tracefile from server: server_name (PID:1939)
1939_dbvctl_DB_name_201910281619.trc
 
oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] ls -l /u01/app/dbvisit/standby/tmp/1939.dbvisit.201910281619.sqlplus.dbv
ls: cannot access /u01/app/dbvisit/standby/tmp/1939.dbvisit.201910281619.sqlplus.dbv: No such file or directory

Solution

To solve this problem, you can change the DDC configuration on the primary to use RMAN to apply the archive logs, at least until the gap is caught up. You will have to synchronize the standby configuration as well.
To use RMAN, set the APPLY_ARCHIVE_RMAN parameter to Y in the DDC configuration file.

The procedure is described below:

Back up the DDC configuration file

oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] cd conf
oracle@server_name:/u01/app/dbvisit/standby/conf/ [DB_name] cp -p dbv_DCC_name.env dbv_DCC_name.env.20191028

Change the parameter

oracle@server_name:/u01/app/dbvisit/standby/conf/ [DB_name] vi dbv_DCC_name.env
oracle@server_name:/u01/app/dbvisit/standby/conf/ [DB_name] diff dbv_DCC_name.env dbv_DCC_name.env.20191028
543c543
< APPLY_ARCHIVE_RMAN = Y
---
> APPLY_ARCHIVE_RMAN = N

Send the configuration changes to the standby

oracle@server_name:/u01/app/dbvisit/standby/conf/ [DB_name] cd ..
oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] ./dbvctl -d DCC_name -C
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 9318)
dbvctl started on server_name: Mon Oct 28 16:51:45 2019
=============================================================
 
>>> Dbvisit Standby configurational differences found between primary_server and standby_server.
Synchronised.
 
=============================================================
dbvctl ended on server_name: Mon Oct 28 16:51:52 2019
=============================================================

Apply the archive logs on the standby again and the operation will complete successfully

oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] ./dbvctl -d DDC_name
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 50909)
dbvctl started on server_name: Mon Oct 28 16:53:05 2019
=============================================================
 
 
>>> Applying Log file(s) from primary_server to DB_name on standby_server:
 
 
Next SCN required for recovery 3328390017 generated at 2019-10-28:11:57:42 +01:00.
Next log(s) required for recovery:
thread 1 sequence 77553
>>> Searching for new archive logs under /u03/app/oracle/dbvisit_arch/DB_name_SITE2... done
thread 1 sequence 77553 (1_77553_973158276.arc)
thread 1 sequence 77554 (1_77554_973158276.arc)
thread 1 sequence 77555 (1_77555_973158276.arc)
thread 1 sequence 77556 (1_77556_973158276.arc)
thread 1 sequence 77557 (1_77557_973158276.arc)
thread 1 sequence 77558 (1_77558_973158276.arc)
thread 1 sequence 77559 (1_77559_973158276.arc)
thread 1 sequence 77560 (1_77560_973158276.arc)
thread 1 sequence 77561 (1_77561_973158276.arc)
thread 1 sequence 77562 (1_77562_973158276.arc)
thread 1 sequence 77563 (1_77563_973158276.arc)
thread 1 sequence 77564 (1_77564_973158276.arc)
thread 1 sequence 77565 (1_77565_973158276.arc)
thread 1 sequence 77566 (1_77566_973158276.arc)
thread 1 sequence 77567 (1_77567_973158276.arc)
thread 1 sequence 77568 (1_77568_973158276.arc)
thread 1 sequence 77569 (1_77569_973158276.arc)
thread 1 sequence 77570 (1_77570_973158276.arc)
thread 1 sequence 77571 (1_77571_973158276.arc)
thread 1 sequence 77572 (1_77572_973158276.arc)
thread 1 sequence 77573 (1_77573_973158276.arc)
thread 1 sequence 77574 (1_77574_973158276.arc)
>>> Catalog archives... done
>>> Recovering database... done
Last applied log(s):
thread 1 sequence 77574
 
Next SCN required for recovery 3331579974 generated at 2019-10-28:16:50:18 +01:00.
Next required log thread sequence
 
>>> Dbvisit Archive Management Module (AMM)
 
Config: number of archives to keep = 0
Config: number of days to keep archives = 7
Config: diskspace full threshold = 80%
==========
 
Processing /u03/app/oracle/dbvisit_arch/DB_name_SITE2...
Archive log dir: /u03/app/oracle/dbvisit_arch/DB_name_SITE2
Total number of archive files : 1025
Number of archive logs deleted = 8
Current Disk percent full : 51%
 
=============================================================
dbvctl ended on server_name: Mon Oct 28 17:10:36 2019
=============================================================

Cet article dbvisit dbvctl process is terminating abnormally with Error Code: 2044 est apparu en premier sur Blog dbi services.

ODA hang/crash due to software raid-check

Fri, 2019-11-29 02:33

Oracle Database Appliance (ODA) is by default configured with software RAID for the Operating System and the Oracle Database software file system (2 internal SSD disks). 2 RAID devices are configured: md0 and md1. ODAs are configured to run the raid-check every Sunday at 1am.

Analysing the problem

In case the ODA is under some load during the raid-check, it can happen that the server freezes. Only the IP layer seems to still be alive: the server replies to the ping command, but the ssh layer is not available any more.
Nothing can be done with the ODA: no ssh connection, all logs and writes on the server are stuck, and an ILOM serial connection is impossible.

The only solution is to power cycle the ODA through ILOM.

Problem could be reproduced on customer side by running 2 RMAN database backups and manually executing the raid-check.

In /var/log/messages we can see that server hung doing raid-check on md1 :

Oct 27 01:00:01 ODA02 kernel: [6245829.462343] md: data-check of RAID array md0
Oct 27 01:00:01 ODA02 kernel: [6245829.462347] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Oct 27 01:00:01 ODA02 kernel: [6245829.462349] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Oct 27 01:00:01 ODA02 kernel: [6245829.462364] md: using 128k window, over a total of 511936k.
Oct 27 01:00:04 ODA02 kernel: [6245832.154108] md: md0: data-check done.
Oct 27 01:01:02 ODA02 kernel: [6245890.375430] md: data-check of RAID array md1
Oct 27 01:01:02 ODA02 kernel: [6245890.375433] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Oct 27 01:01:02 ODA02 kernel: [6245890.375435] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Oct 27 01:01:02 ODA02 kernel: [6245890.375452] md: using 128k window, over a total of 467694592k.
Oct 27 04:48:07 ODA02 kernel: imklog 5.8.10, log source = /proc/kmsg started. ==> Restart of ODA with ILOM, server froze on data-check of RAID array md1
Oct 27 04:48:07 ODA02 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="5788" x-info="http://www.rsyslog.com"] start
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Initializing cgroup subsys cpuset
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Initializing cgroup subsys cpu
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Initializing cgroup subsys cpuacct
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Linux version 4.1.12-124.20.3.el6uek.x86_64 (mockbuild@ca-build84.us.oracle.com) (gcc version 4.9.2 20150212 (Red Hat 4.9.2-6.2.0.3) (GCC) ) #2 SMP Thu Oct 11 17:47:32 PDT 2018
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Command line: ro root=/dev/mapper/VolGroupSys-LogVolRoot rd_NO_LUKS rd_MD_UUID=424664a7:c29524e9:c7e10fcf:d893414e rd_LVM_LV=VolGroupSys/LogVolRoot rd_LVM_LV=VolGroupSys/LogVolSwap SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM pci=noaer crashkernel=256M@64M loglevel=3 panic=60 transparent_hugepage=never biosdevname=1 ipv6.disable=1 intel_idle.max_cstate=1 nofloppy nomce numa=off console=ttyS0,115200n8 console

Solution: reduce the raid-check CPU and IO priority

By default the raid-check is configured with low priority. Setting the priority to idle ensures that the resources used by the check are limited.

Change NICE=low to NICE=idle in the /etc/sysconfig/raid-check configuration file, as sketched below.
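
If you prefer to script the change, a minimal hedged sketch could look like this (adjust the backup naming to your own standards):

# back up the file, switch NICE to idle and verify the result
cp -p /etc/sysconfig/raid-check /etc/sysconfig/raid-check.$(date +%Y%m%d)
sed -i 's/^NICE=.*/NICE=idle/' /etc/sysconfig/raid-check
grep ^NICE /etc/sysconfig/raid-check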

[root@ODA02 log]# cat /etc/sysconfig/raid-check
#!/bin/bash
#
# Configuration file for /usr/sbin/raid-check
#
# options:
# ENABLED - must be yes in order for the raid check to proceed
# CHECK - can be either check or repair depending on the type of
# operation the user desires. A check operation will scan
# the drives looking for bad sectors and automatically
# repairing only bad sectors. If it finds good sectors that
# contain bad data (meaning that the data in a sector does
# not agree with what the data from another disk indicates
# the data should be, for example the parity block + the other
# data blocks would cause us to think that this data block
# is incorrect), then it does nothing but increments the
# counter in the file /sys/block/$dev/md/mismatch_count.
# This allows the sysadmin to inspect the data in the sector
# and the data that would be produced by rebuilding the
# sector from redundant information and pick the correct
# data to keep. The repair option does the same thing, but
# when it encounters a mismatch in the data, it automatically
# updates the data to be consistent. However, since we really
# don't know whether it's the parity or the data block that's
# correct (or which data block in the case of raid1), it's
# luck of the draw whether or not the user gets the right
# data instead of the bad data. This option is the default
# option for devices not listed in either CHECK_DEVS or
# REPAIR_DEVS.
# CHECK_DEVS - a space delimited list of devs that the user specifically
# wants to run a check operation on.
# REPAIR_DEVS - a space delimited list of devs that the user
# specifically wants to run a repair on.
# SKIP_DEVS - a space delimited list of devs that should be skipped
# NICE - Change the raid check CPU and IO priority in order to make
# the system more responsive during lengthy checks. Valid
# values are high, normal, low, idle.
# MAXCONCURENT - Limit the number of devices to be checked at a time.
# By default all devices will be checked at the same time.
#
# Note: the raid-check script is run by the /etc/cron.d/raid-check cron job.
# Users may modify the frequency and timing at which raid-check is run by
# editing that cron job and their changes will be preserved across updates
# to the mdadm package.
#
# Note2: you can not use symbolic names for the raid devices, such as you
# /dev/md/root. The names used in this file must match the names seen in
# /proc/mdstat and in /sys/block.
 
ENABLED=yes
CHECK=check
NICE=idle
# To check devs /dev/md0 and /dev/md3, use "md0 md3"
CHECK_DEVS=""
REPAIR_DEVS=""
SKIP_DEVS=""
MAXCONCURRENT=

Change raid-check scheduling

Configure the raid-check to run in a low-activity period and avoid running it during database backup windows, for example.

[root@ODA02 ~]# cd /etc/cron.d
 
[root@ODA02 cron.d]# cat raid-check
# Run system wide raid-check once a week on Sunday at 1am by default
0 1 * * Sun root /usr/sbin/raid-check
 
[root@ODA02 cron.d]# vi raid-check
 
[root@ODA02 cron.d]# cat raid-check
# Run system wide raid-check once a week on Sunday at 1am by default
0 19 * * Sat root /usr/sbin/raid-check

Conclusion

These configuration changes were successfully tested on the customer environment. No crash/hang was experienced with the NICE parameter set to idle.
As per the Oracle documentation, the ODA BIOS default configuration could be changed to use hardware RAID:
ODA – configuring RAID
The question would be whether patching an ODA is still possible afterwards. If you would like to change this configuration, I would strongly recommend getting Oracle support approval first.

Cet article ODA hang/crash due to software raid-check est apparu en premier sur Blog dbi services.

Enabling, disabling, and validating foreign key constraints in PostgreSQL

Thu, 2019-11-28 01:39

Constraints are an important concept in every relational database system and they guarantee the correctness of your data. While constraints are essential there are situations when it is required to disable or drop them temporarily. The reason could be performance related, because it is faster to validate the constraints at once after a data load. The reason could also be that you need to load data and you do not know if the data is ordered in such a way that all foreign keys will validate at the time the data is loaded. In such a case it is required to either drop the constraints or to disable them until the data load is done. Validation of the constraints is deferred until all your data is there.

As always, let's start with a simple test case: two tables, the second one referencing the first one:

postgres=# create table t1 ( a int primary key
postgres(#                 , b text
postgres(#                 , c date
postgres(#                 );
CREATE TABLE
postgres=# create table t2 ( a int primary key
postgres(#                 , b int references t1(a)
postgres(#                 , c text
postgres(#                 );
CREATE TABLE

Two rows for each of them:

postgres=# insert into t1 (a,b,c) values(1,'aa',now());
INSERT 0 1
postgres=# insert into t1 (a,b,c) values(2,'bb',now());
INSERT 0 1
postgres=# insert into t2 (a,b,c) values (1,1,'aa');
INSERT 0 1
postgres=# insert into t2 (a,b,c) values (2,2,'aa');

Currently the two tiny tables look like this:

postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | text    |           |          | 
 c      | date    |           |          | 
Indexes:
    "t1_pkey" PRIMARY KEY, btree (a)
Referenced by:
    TABLE "t2" CONSTRAINT "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)

postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)

postgres=# 

Let's assume we want to load some data provided by a script. As we do not know the ordering of the data in the script, we decide to disable the foreign key constraint on the t2 table and validate it after the load:

postgres=# alter table t2 disable trigger all;
ALTER TABLE

The syntax might look a bit strange but it actually does disable the foreign key, and it would have disabled all the foreign keys if there had been more than one. It becomes clearer when we look at the table again:

postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)
Disabled internal triggers:
    "RI_ConstraintTrigger_c_16460" AFTER INSERT ON t2 FROM t1 NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE FUNCTION "RI_FKey_check_ins"()
    "RI_ConstraintTrigger_c_16461" AFTER UPDATE ON t2 FROM t1 NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE FUNCTION "RI_FKey_check_upd"()

“ALL” means: please also disable the internal triggers that are responsible for verifying the constraints. One restriction of the “ALL” keyword is that you need to be a superuser to do that. Trying it with a normal user will fail:

postgres=# create user u1 with login password 'u1';
CREATE ROLE
postgres=# \c postgres u1
You are now connected to database "postgres" as user "u1".
postgres=> create table t3 ( a int primary key
postgres(>                 , b text
postgres(>                 , c date
postgres(>                 );
CREATE TABLE
postgres=> create table t4 ( a int primary key
postgres(>                 , b int references t3(a)
postgres(>                 , c text
postgres(>                 );
CREATE TABLE
postgres=> alter table t4 disable trigger all;
ERROR:  permission denied: "RI_ConstraintTrigger_c_16484" is a system trigger
postgres=> 

What you can do as a regular user is to disable the user triggers only:

postgres=> alter table t4 disable trigger user;
ALTER TABLE

As I do not have any user triggers this of course does not make much sense here. Coming back to our initial t1 and t2 tables: as the foreign key is currently disabled we can insert data into the t2 table that violates the constraint:

postgres=# select * from t1;
 a | b  |     c      
---+----+------------
 1 | aa | 2019-11-27
 2 | bb | 2019-11-27
(2 rows)

postgres=# select * from t2;
 a | b | c  
---+---+----
 1 | 1 | aa
 2 | 2 | aa
(2 rows)

postgres=# insert into t2 (a,b,c) values (3,3,'cc');
INSERT 0 1
postgres=# 

There clearly is no matching parent for this row in the t1 table but the insert succeeds, as the foreign key is disabled. Time to validate the constraint:

postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)
Disabled internal triggers:
    "RI_ConstraintTrigger_c_16460" AFTER INSERT ON t2 FROM t1 NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE FUNCTION "RI_FKey_check_ins"()
    "RI_ConstraintTrigger_c_16461" AFTER UPDATE ON t2 FROM t1 NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE FUNCTION "RI_FKey_check_upd"()

postgres=# alter table t2 enable trigger all;
ALTER TABLE
postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)

postgres=# alter table t2 validate constraint t2_b_fkey;
ALTER TABLE
postgres=# 

Surprise, surprise, PostgreSQL does not complain about the invalid row. Why is that? If we ask the pg_constraint catalog table the constraint is recorded as validated:

postgres=# select convalidated from pg_constraint where conname = 't2_b_fkey' and conrelid = 't2'::regclass;
 convalidated 
--------------
 t
(1 row)

It is even validated if we disable it once more:

postgres=# alter table t2 disable trigger all;
ALTER TABLE
postgres=# select convalidated from pg_constraint where conname = 't2_b_fkey' and conrelid = 't2'::regclass;
 convalidated 
--------------
 t
(1 row)

That implies that PostgreSQL will not validate the constraint when we enable the internal triggers, and PostgreSQL will not validate all the data as long as the status is valid. What we really need to do to get the constraint validated is to invalidate it first:

postgres=# alter table t2 alter CONSTRAINT t2_b_fkey not valid;
ERROR:  ALTER CONSTRAINT statement constraints cannot be marked NOT VALID

It seems this is not the correct way of doing it. The correct way is to drop the foreign key and then re-create it with the NOT VALID status:

postgres=# alter table t2 drop constraint t2_b_fkey;
ALTER TABLE
postgres=# delete from t2 where a in (3,4);
DELETE 2
postgres=# alter table t2 add constraint t2_b_fkey foreign key (b) references t1(a) not valid;
ALTER TABLE
postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a) NOT VALID

Now we have the desired state and we can insert our data:

postgres=# insert into t2(a,b,c) values (3,3,'cc');
ERROR:  insert or update on table "t2" violates foreign key constraint "t2_b_fkey"
DETAIL:  Key (b)=(3) is not present in table "t1".

Surprise, again. Creating a “not valid” constraint only tells PostgreSQL not to scan the whole table to validate if all the rows are valid. For data inserted or updated the constraint is still checked, and this is why the insert fails.

What options do we have left? The obvious one is this:

  • Drop all the foreign keys.
  • Load the data.
  • Re-create the foreign keys, but leave them NOT VALID to avoid the costly scan of the tables. New and updated data will be validated from now on.
  • Validate the constraints later, when there is less load on the system (a minimal sketch of this approach follows the list).
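
A minimal sketch of that approach, re-using the tables from above (hedged; the actual load step of course depends on your data):

-- 1. drop the foreign key before the load
alter table t2 drop constraint t2_b_fkey;
-- 2. load the data (insert/copy/whatever your load process is)
-- 3. re-create the constraint as NOT VALID: the existing rows are not scanned,
--    but new and updated rows are checked from now on
alter table t2 add constraint t2_b_fkey foreign key (b) references t1(a) not valid;
-- 4. validate later, when there is less load on the system
alter table t2 validate constraint t2_b_fkey;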

Another possibility would be this:

postgres=# alter table t2 alter constraint t2_b_fkey deferrable;
ALTER TABLE
postgres=# begin;
BEGIN
postgres=# set constraints all deferred;
SET CONSTRAINTS
postgres=# insert into t2 (a,b,c) values (3,3,'cc');
INSERT 0 1
postgres=# insert into t2 (a,b,c) values (4,4,'dd');
INSERT 0 1
postgres=# insert into t1 (a,b,c) values (3,'cc',now());
INSERT 0 1
postgres=# insert into t1 (a,b,c) values (4,'dd',now());
INSERT 0 1
postgres=# commit;
COMMIT

The downside of this is that it only works until the next commit, so you have to do all your work in one transaction. The key point of this post is that the assumption that the following will validate your data is false:

postgres=# alter table t2 disable trigger all;
ALTER TABLE
postgres=# insert into t2 (a,b,c) values (5,5,'ee');
INSERT 0 1
postgres=# alter table t2 enable trigger all;
ALTER TABLE
postgres=# 

This will only validate new data but it does not guarantee that all the rows satisfy the constraint:

postgres=# insert into t2 (a,b,c) values (6,6,'ff');
ERROR:  insert or update on table "t2" violates foreign key constraint "t2_b_fkey"
DETAIL:  Key (b)=(6) is not present in table "t1".
postgres=# select * from t2 where b = 5;
 a | b | c  
---+---+----
 5 | 5 | ee
(1 row)

postgres=# select * from t1 where a = 5;
 a | b | c 
---+---+---
(0 rows)

Finally: There is another way of doing it, but this directly updates the pg_constraint catalog table and this is something you should _not_ do (never update internal tables directly!):

postgres=# delete from t2 where b = 5;
DELETE 1
postgres=# alter table t2 disable trigger all;
ALTER TABLE
postgres=# insert into t2 values (5,5,'ee');
INSERT 0 1
postgres=# alter table t2 enable trigger all;
ALTER TABLE
postgres=# update pg_constraint set convalidated = false where conname = 't2_b_fkey' and conrelid = 't2'::regclass;
UPDATE 1
postgres=# alter table t2 validate constraint t2_b_fkey;
ERROR:  insert or update on table "t2" violates foreign key constraint "t2_b_fkey"
DETAIL:  Key (b)=(5) is not present in table "t1".
postgres=# 

In this case the constraint will be fully validated as it is recorded as invalid in the catalog.

Conclusion: Do not rely on assumptions, always carefully test your procedures.

Cet article Enabling, disabling, and validating foreign key constraints in PostgreSQL est apparu en premier sur Blog dbi services.

A schema and a user are not the same in PostgreSQL

Wed, 2019-11-27 11:56

When people with an Oracle background attend our PostgreSQL DBA Essentials training there is always a bit of confusion about schemas and users. In Oracle a schema and a user have a one-to-one relationship and there is no real distinction between a user and a schema. In PostgreSQL the situation is different: all the objects a user creates are created in a specific schema (or namespace). Other users may or may not have permissions to work with these objects or even to create new objects in a specific schema. Compared to Oracle there is one more layer.

The hierarchy in PostgreSQL is this:

|-------------------------------------------|---|
| PostgreSQL instance                       |   |
|-------------------------------------------| U |
|     Database 1      |     Database 2      | S |
|---------------------|---------------------| E |
| Schema 1 | Schema 2 | Schema 1 | Schema 2 | R |
|----------|----------|----------|----------| S |
| t1,t2,t3 | t1,t2,t3 | t1,t2,t3 | t1,t2,t3 |   |
-------------------------------------------------

What this little ASCII image shall tell you: users (and roles) in PostgreSQL are global objects and are not defined in a database but at the instance level. Schemas are created by users in a specific database and contain database objects. What a lot of people get confused by is this:

postgres@centos8pg:/home/postgres/ [pgdev] psql -X postgres
psql (13devel)
Type "help" for help.

postgres=# create table t1 ( a int );
CREATE TABLE
postgres=# 

Nothing in this create table statement references a schema but according to what I just said above all objects must go to a schema. Where did this table go then? Each PostgreSQL database comes with a public schema by default and if you do not explicitly specify a schema the new object will go there. There are several ways of asking PostgreSQL for the schema of a given table but probably the two most used ones are these (the first one is asking a catalog view and the second one is using a psql shortcut)

postgres=# select schemaname from pg_tables where tablename = 't1';
 schemaname 
------------
 public
(1 row)

postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 

Btw: The public schema is a special schema in PostgreSQL and you should either remove it or at least revoke the CREATE permission on it from PUBLIC. Check here for more information on that.
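
As a hedged sketch, the revoke variant could look like this (the drop variant is exactly what is shown right below):

-- take away the CREATE permission on the public schema from everybody
revoke create on schema public from public;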

So what happens when you drop the public schema and try to create a table afterwards?

postgres=# drop schema public cascade;
NOTICE:  drop cascades to table t1
DROP SCHEMA
postgres=# create table t1 ( a int );
ERROR:  no schema has been selected to create in
LINE 1: create table t1 ( a int );
                     ^
postgres=# 

As we do not have a single schema anymore:

postgres=# \dn
List of schemas
 Name | Owner 
------+-------
(0 rows)

… PostgreSQL has no idea where to put the table. At this point it should already be clear that a schema in PostgreSQL is not the same as a user. We are connected as the “postgres” user, but we do not have a schema to create our objects in. Let's create the first schema and right afterwards the same table as above:

postgres=# create schema my_schema;
CREATE SCHEMA
postgres=# create table t1 ( a int );
ERROR:  no schema has been selected to create in
LINE 1: create table t1 ( a int );
                     ^
postgres=# 

… again PostgreSQL is not able to create the table. The question is: why did it work when the public schema was there? We did not specify the public schema above but it worked. This is where the search_path comes into play:

postgres=# show search_path;
   search_path   
-----------------
 "$user", public
(1 row)

postgres=# 

By default the search_path contains your current username and public. As none of these schemas exist right now the create table statement will fail. There are two options to fix that. Either use the fully qualified name:

postgres=# create table my_schema.t1 ( a int );
CREATE TABLE
postgres=# \d my_schema.t1
               Table "my_schema.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 

… or adjust the search_path so that your preferred schema comes first:

postgres=# set search_path = 'my_schema',"$user",public;
SET
postgres=# show search_path ;
        search_path         
----------------------------
 my_schema, "$user", public
(1 row)

postgres=# create table t2 ( a int );
CREATE TABLE
postgres=# \d t2
               Table "my_schema.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 

postgres=# 

That all might look a bit strange at the beginning, especially when you are used to Oracle, but it also provides great flexibility:

  • A user can create many different schemas, no need to create separate users
  • A user can grant permission to create objects in one of his schemas to someone else
  • You can logically divide your application
  • (no, there are no synonyms in PostgreSQL)
  • There are default privileges you can use (see the sketch right after this list)
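
As a hedged sketch of the grant and default-privileges points above (my_schema is the schema created earlier, u1 is just an example user name):

-- allow user u1 to work in my_schema and to create objects there
grant usage, create on schema my_schema to u1;
-- in addition, let u1 select from all tables the current user creates in my_schema from now on
alter default privileges in schema my_schema grant select on tables to u1;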

Cet article A schema and a user are not the same in PostgreSQL est apparu en premier sur Blog dbi services.

DOAG 2019: Cloud or Kubernetes on premise and CI/CD pipeline at the top of (my) interest

Fri, 2019-11-22 11:33

The DOAG 2019 is just over now and if I check the subjects of the sessions I attended, I have the feeling that Cloud, Kubernetes on-premise deployments and CI/CD pipelines were the top subjects interesting people, or maybe just me!

I started with a DevOps session and then followed a lot of Kubernetes- or container-related sessions. The following sessions are those I directly think about when trying to summarize.

“The Pillars of Continuous Delivery” – Lykle Thijssen

A very interesting session about continuous delivery.

“How to create a solid foundation for Continuous Delivery, based on four pillars: Agile, Microservices, Automation and Cloud. Lykle explained how agile development can reduce time to production by short development cycles, how Microservices make the deployment process easier with less dependencies, how automation can improve test, build and deployment and how cloud can help with the necessary infrastructure. It all hangs together and without a solid foundation, there is always the risk of building a house of cards.”

“Containers Demystified” – Jan Karremans and Daniel Westermann

“If you are looking at implementing Dev/Ops, infrastructure as a code, if you want to adopt the Cloud or seeking to go with Microservices, you will find Containers on you path. After the opportunities that virtualization brought, Containers are the next best thing! Also (and perhaps specifically) looking at databases with containers, it brings specific challenges. Where containers infrastructures are built to fail and database rely on persistency, you have challenges.”

Watch My Services: Prometheus in Kubernetes – Thorsten Wussow

“The monitoring of microservices is a special challenge. In the lecture the problem of a monitoring of microservices will be discussed. Furthermore, various products are briefly considered, with which one can still carry out such a monitoring. Then a demo shows how to set up monitoring in a Kubernetes cluster with the help of Prometheus and Grafana and what to look for.”

A Kubernetes cluster was used in the Oracle Infrastructure Cloud to demonstrate the deployment and the configuration of Promotheus and Grafana.
With Thorsten, it always looks simple when following the demonstration. Now it is time to implement it.

Kubernetes im Vergleich: Google, AWS, Oracle, Microsoft – Michael Schulze & Borys Neselovskyi

“The presentation gives a brief overview of the architecture and construction of Kubernetes. In addition, useful application scenarios are presented in selected use cases. Later in the talk, we will compare existing Kubernetes cloud solutions from leading vendors (Google, AWS, Oracle and Microsoft). Here, a customer scenario is used as the basis for the comparison. The following criteria play a role: installation, maintenance, performance, monitoring and, of course, costs.”

A nice comparison between the different cloud solutions.

There were some other very interesting sessions but I will not list all of them now.

DOAG 2019 is over. See you next year there in Nuremberg

Cet article DOAG 2019: Cloud or Kubernetes on premise and CI/CD pipeline at the top of (my) interest est apparu en premier sur Blog dbi services.

NFS Storage Configuration for Kubernetes

Tue, 2019-11-19 09:08

For one of our important customers, we are working on a major project to migrate critical applications to containers. From the implementation of the Kubernetes architecture to the deployment of applications and the administration of the platform, we are responsible for an important technological stack with new challenges for our team.

One of the challenges, both important and exciting, is the implementation of Kubernetes clusters on bare metal (VM) and its management. We have deployed a Kubernetes cluster in VMs, based on VMWare.

As you know, one of the challenges of containerization is storage management. Do we manage stateless or stateful applications? For stateful applications, the way the data generated by the application is stored is very important.

Therefore, based on our infrastructure, we have 2 possibilities:

Here is a representative diagram of the 2 solutions:

Configuring NFS storage for Kubernetes

The Kubernetes infrastructure is composed of the following:

  • k8s-master
  • k8s-worker1
  • k8s-worker2

In addition, we have an NFS server to store our cluster data. In the next steps, we are going to expose the NFS share as a cluster object. We will create Kubernetes Persistent Volumes and Persistent Volume Claims for our application.
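
For completeness, a minimal sketch of what the export on the NFS server side could look like (hedged: the client network below is an assumption, the path matches the volume definition used later):

# /etc/exports on the NFS server (example only)
/home/ec2-user/data    10.3.0.0/16(rw,sync,no_root_squash)
# reload the export table
sudo exportfs -ra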

Persistent Volume Creation

Define the persistent volume at the cluster level as follows:

[ec2-user@ip-10-3-1-217 ~]$ vi create-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-demo
  labels:
    app: nfs
    type: data
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  nfs:
    path: /home/ec2-user/data
    server: ec2-3-88-194-14.compute-1.amazonaws.com
  persistentVolumeReclaimPolicy: Retain

Create the persistent volume and see the results:

[ec2-user@ip-10-3-1-217 ~]$ kubectl create -f create-pv.yaml
persistentvolume/nfs-demo created
[ec2-user@ip-10-3-1-217 ~]$ kubectl get pv
NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
nfs-demo   10Gi       RWO            Retain           Available                                   7s

Once it’s created we can create a persistent volume claim. A PVC is dedicated to a specific namespace.
First, create the nfs-demo namespace, then the PVC.

[ec2-user@ip-10-3-1-217 ~]$ kubectl create ns nfs-demo
namespace/nfs-demo created
[ec2-user@ip-10-3-1-217 ~]$ vi create-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-demo
  namespace: nfs-demo
  labels:
   app: nfs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
     requests:
       storage: 10Gi
  selector:
    matchLabels:
      app: nfs
      type: data
[ec2-user@ip-10-3-1-217 ~]$ kubectl create -f create-pvc.yaml
persistentvolumeclaim/nfs-demo created

[ec2-user@ip-10-3-1-217 ~]$ kubectl get pvc -n nfs-demo
NAME       STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
nfs-demo   Bound    nfs-demo   10Gi       RWO                           3m21s

We can see now that our persistent volume changes its status from “Available” to “Bound”.

[ec2-user@ip-10-3-1-217 ~]$ kubectl get pv
NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
nfs-demo   10Gi       RWO            Retain           Bound    nfs-demo/nfs-demo                           169m

Finally, let’s deploy now our workload which will consume the volume claim and the persistent volume. Whatever the workload API object you are using (Deployment, StatefulSet or DaemonSet) the Persistent Volume Claim is defined within the Pod specification, as follows:

[ec2-user@ip-10-3-1-217 ~]$ vi create-pod.yaml

kind: Pod
apiVersion: v1
metadata:
  name: nfs-pod
spec:
  containers:
    - name: nfs-demo
      image: alpine
      volumeMounts:
      - name: nfs-demo
        mountPath: /data/nfs
      command: ["/bin/sh"]
      args: ["-c", "sleep 500000"]
  volumes:
  - name: nfs-demo
    persistentVolumeClaim:
      claimName: nfs-demo
[ec2-user@ip-10-3-1-217 ~]$ kubectl create -f create-pod.yaml
pod/nfs-pod created

[ec2-user@ip-10-3-1-217 ~]$ kubectl get pods -o wide -n nfs-demo
NAME      READY   STATUS    RESTARTS   AGE   IP              NODE                         NOMINATED NODE   READINESS GATES
nfs-pod   1/1     Running   0          9s    192.168.37.68   ip-10-3-0-143.ec2.internal              

Let’s now create an empty file into the container volume mount path and see if it is has been created on the NFS server.

[ec2-user@ip-10-3-1-217 ~]$ kubectl -n nfs-demo exec nfs-pod touch /data/nfs/test-nfs.sh

We can now see on the NFS server that the file has been properly stored.

mehdi@MacBook-Pro: ssh -i "dbi.pem" ec2-user@ec2-3-88-194-14.compute-1.amazonaws.com
Last login: Tue Nov 19 13:35:18 2019 from 62.91.42.92

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
16 package(s) needed for security, out of 27 available
Run "sudo yum update" to apply all updates.

[ec2-user@ip-10-3-0-184 ~]$ ls -lrt data/
total 0
-rw-r--r-- 1 root root 0 Nov 19 13:42 test-nfs.sh

Cet article NFS Storage Configuration for Kubernetes est apparu en premier sur Blog dbi services.

Fun with arrays in PostgreSQL

Mon, 2019-11-18 00:30

As you might already know, PostgreSQL comes with many, many data types. What you might not know is that you can create arrays over all these data types quite easily. Is that important? Well, as always it depends on your requirements, but there are use cases where arrays are great. As always, let's do some simple tests.

The following will create a very simple table with one column, which is a one-dimensional array of integers.

postgres=# create table t1 ( a int[] );
CREATE TABLE
postgres=# \d t1
                  Table "public.t1"
 Column |   Type    | Collation | Nullable | Default 
--------+-----------+-----------+----------+---------
 a      | integer[] |           |          | 

To insert data into that table you would either do it like this:

postgres=# insert into t1 (a) values ( '{1,2,3,4,5,6}' );
INSERT 0 1
postgres=# select * from t1;
       a       
---------------
 {1,2,3,4,5,6}
(1 row)

… or you can do it like this as well:

postgres=# insert into t1 (a) values ( ARRAY[1,2,3,4,5,6] );
INSERT 0 1
postgres=# select * from t1;
       a       
---------------
 {1,2,3,4,5,6}
 {1,2,3,4,5,6}
(2 rows)

Notice that I did not specify any size of the array. Although you can do that:

postgres=# create table t2 ( a int[6] );
CREATE TABLE

… the limit is not enforced by PostgreSQL:

postgres=# insert into t2 (a) values ( '{1,2,3,4,5,6,7,8}' );
INSERT 0 1
postgres=# select * from t2;
         a         
-------------------
 {1,2,3,4,5,6,7,8}
(1 row)

PostgreSQL does not limit you to one-dimensional arrays, you can well go ahead and create more dimensions:

postgres=# create table t3 ( a int[], b int[][], c int[][][] );
CREATE TABLE
postgres=# \d t3
                  Table "public.t3"
 Column |   Type    | Collation | Nullable | Default 
--------+-----------+-----------+----------+---------
 a      | integer[] |           |          | 
 b      | integer[] |           |          | 
 c      | integer[] |           |          | 

Although it does look like all of the columns are one-dimensional they are actually not:

postgres=# insert into t3 (a,b,c) values ( '{1,2,3}', '{{1,2,3},{1,2,3}}','{{{1,2,3},{1,2,3},{1,2,3}}}' );
INSERT 0 1
postgres=# select * from t3;
    a    |         b         |              c              
---------+-------------------+-----------------------------
 {1,2,3} | {{1,2,3},{1,2,3}} | {{{1,2,3},{1,2,3},{1,2,3}}}
(1 row)

In reality those array columns are not really one-dimensional, you can create as many dimensions as you like even when you think you created one dimension only:

postgres=# create table t4 ( a int[] );
CREATE TABLE
postgres=# insert into t4 (a) values ( '{1}' );
INSERT 0 1
postgres=# insert into t4 (a) values ( '{1,2}' );
INSERT 0 1
postgres=# insert into t4 (a) values ( '{{1,2},{1,2}}' );
INSERT 0 1
postgres=# insert into t4 (a) values ( '{{{1,2},{1,2},{1,2}}}' );
INSERT 0 1
postgres=# insert into t4 (a) values ( '{{{{1,2},{1,2},{1,2},{1,2}}}}' );
INSERT 0 1
postgres=# select * from t4;
               a               
-------------------------------
 {1}
 {1,2}
 {{1,2},{1,2}}
 {{{1,2},{1,2},{1,2}}}
 {{{{1,2},{1,2},{1,2},{1,2}}}}
(5 rows)

Now that there are some rows: how can we query that? This matches the first two rows of the table:

postgres=# select ctid,* from t4 where a[1] = 1;
 ctid  |   a   
-------+-------
 (0,1) | {1}
 (0,2) | {1,2}
(2 rows)

This matches the second row only:

postgres=# select ctid,* from t4 where a = '{1,2}';
 ctid  |   a   
-------+-------
 (0,2) | {1,2}
(1 row)

This matches row three only:

postgres=# select ctid, * from t4 where a[1:2][1:3] = '{{1,2},{1,2}}';
 ctid  |       a       
-------+---------------
 (0,3) | {{1,2},{1,2}}
(1 row)

You can even index array data types by using a GIN index:

postgres=# create index i1 ON t4 using gin (a);
CREATE INDEX
postgres=# \d t4
                  Table "public.t4"
 Column |   Type    | Collation | Nullable | Default 
--------+-----------+-----------+----------+---------
 a      | integer[] |           |          | 
Indexes:
    "i1" gin (a)

This does not make much sense right now as we do not have sufficient data for PostgreSQL to consider the index, but as soon as we have more data the index will be helpful:

postgres=# insert into t4 select '{{1,2},{1,2}}' from generate_series(1,1000000);
INSERT 0 1000000
postgres=# explain select ctid,* from t4 where a = '{1,2}';
                            QUERY PLAN                            
------------------------------------------------------------------
 Bitmap Heap Scan on t4  (cost=28.00..32.01 rows=1 width=51)
   Recheck Cond: (a = '{1,2}'::integer[])
   ->  Bitmap Index Scan on i1  (cost=0.00..28.00 rows=1 width=0)
         Index Cond: (a = '{1,2}'::integer[])
(4 rows)
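
Besides equality, the GIN index also supports the array containment and overlap operators; a quick hedged sketch (no output shown, as the numbers depend on the generated data):

-- does the array contain the element 2? (@> is supported by the GIN index)
select count(*) from t4 where a @> ARRAY[2];
-- do the two arrays have at least one element in common?
select count(*) from t4 where a && ARRAY[1,2];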

In addition to that PostgreSQL comes with many support functions for working with arrays, e.g. to get the length of an array:

postgres=# select array_length(a,1) from t4 limit 2;
 array_length 
--------------
            1
            2
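
Two more of these support functions as a quick, hedged sketch:

-- turn an array into a set of rows
select unnest('{1,2,3}'::int[]);
-- append an element to an array
select array_append('{1,2}'::int[], 3);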

As I mentioned at the beginning of this post you can create arrays of all kinds of data types, not only integers:

postgres=# create table t5 ( a date[], b timestamp[], c text[], d point[], e boolean[] );
CREATE TABLE
postgres=# \d t5
                            Table "public.t5"
 Column |             Type              | Collation | Nullable | Default 
--------+-------------------------------+-----------+----------+---------
 a      | date[]                        |           |          | 
 b      | timestamp without time zone[] |           |          | 
 c      | text[]                        |           |          | 
 d      | point[]                       |           |          | 
 e      | boolean[]                     |           |          | 

Whatever you want. You can even create arrays over user-defined types:

postgres=# create type type1 as ( a int, b text );
CREATE TYPE
postgres=# create table t6 ( a type1[] );
CREATE TABLE
postgres=# \d t6
                 Table "public.t6"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | type1[] |           |          | 

Quite powerful.

Cet article Fun with arrays in PostgreSQL est apparu en premier sur Blog dbi services.

A day of conferences with the Swiss Oracle User Group

Sun, 2019-11-17 10:00
Introduction

I'm not that excited by all these events around Oracle technologies (and beyond), but they are always a good place to learn new things and, maybe most important, to discover new ways of thinking. And regarding this point, I was not disappointed.

Franck Pachot: serverless and distributed database

Franck talked about scaling out, which means avoiding monoliths. Most database servers are this kind of monolith today, and he advises us to think microservices. It's not so easy regarding the database component, but it could surely simplify the management of different modules across different developer teams. Achieving scale-out also means getting rid of the old cluster technologies (think about RAC) and, instead, adopting a “share nothing” approach: no storage sharing, no network sharing, etc.
It also means the need for database replication, and also scaling the writes: and that point is more complicated. Sharding is a key point for scaling out (put the associated data where the users reside).

I discovered the CAP theorem, a very interesting theory that shows us that there is actually no ultimate solution. You need to choose your priorities: Consistency and Availability, Availability and Partition tolerance, or Consistency and Partition tolerance. Just remember to keep your database infrastructure adapted to your needs: a Google-like infrastructure is probably nice, but do you really need the same?

Kamran Aghayer: Transition from dba to data engineer

Times are changing. I have known that for several years, but now it's obvious: as a traditional DBA, I will soon be deprecated. Old-school DBA jobs will be replaced by a lot of new jobs: data architect, data engineer, data analyst, data scientist, machine learning engineer, AI engineer, …

Kamran focused on the Hadoop ecosystem and Spark, especially when he needed to archive data from Exadata to Hadoop (and explained how Hadoop manages data through the HDFS filesystem and datanodes – sort of ASM). He used a dedicated connector, a sort of wrapper using external tables. Actually this is also what's inside the Big Data Appliance from Oracle. This task was out of the scope of a traditional DBA, as a good knowledge of the data was needed. So, the traditional DBA is dead.

Stefan Oehrli – PDB isolation and security

Since Oracle announced the availability of 3 free PDBs with each container database, the interest for Multitenant increased.

We had an overview of the top 10 security risks, all about privileges: privilege abuse, unauthorized privilege elevation, platform vulnerabilities, SQL injection, etc. If you're already in the cloud with PaaS or DBaaS, the risks are the same.

We had a presentation of several clues for risk mitigation:
– path_prefix: it’s some kind of chroot for the PDB
– PDB_os_credential (still bugs but…): concerns credentials and dbms_scheduler
– lockdown profiles: a tool for restricting database features like queuing, partitioning, Java OS access or altering the database. Restrictions work with inclusion or exclusion.

Paolo Kreth and Thomas Bauman: The role of the DBA in the era of Cloud, Multicloud and Autonomous Database

I already heard today that the classic DBA will soon be dead. And now the second bullet. The fact is that Oracle has worked hard to improve autonomous features during the last 20 years and, as it was presented, you realize that it's clearly true. Who cares about extent management now?

But there is still hope. The DBA of tomorrow is starting today. As the DBA role actually sits between the infrastructure team and the data scientists, there is a way to architect your career: keep a foot in technical stuff, but become a champion in data analysis and machine learning.

Or focus on development with opensource and cloud. The DBA job can shift, don’t miss this opportunity.

Nikitas Xenakis – MAA with 19c and GoldenGate 19c: a real-world case study

Hey! Finally, the DBA is not dead yet! Some projects still need technical skills and complex architectures. The presented project was driven by downtime costs, and for some kinds of businesses, a serious downtime can kill the company. The customer concerned by this project cannot afford more than 1h of global downtime.

We had an introduction of MAA (standing for Maximum Availability Architecture – see Oracle documentation for that).

You first need to estimate:
– the RPO: how much data you can afford to lose
– the RTO: how quickly you'll be up again
– the performance you expect after the downtime: because it matters

The presented infrastructure was composed of RHEL, RAC with Multitenant (1 PDB only), Active Data Guard and GoldenGate. The middleware was not from Oracle but configured to work with Transparent Application Failover.

For sure, you still need several old-school DBAs to set up and manage this kind of infrastructure.

Luiza Nowak: Error when presenting your data

You can refer to the blog from Elisa USAI for more information.

For me, it was very surprising to discover how a presentation can be boring, confusing and miss the point just because of inappropriate slides. Be precise, be captivating, use graphics instead of sentences, and use them well, if you want your presentation to have the expected impact.

Julian Frey: Database cloning in a multitenant environment

Back to pure DBA stuff. A quick reminder of why we need to clone, and what we need to clone (data, metadata, partial data, refreshed data only, anonymised data, etc). And now, always considering GDPR compliance!

Cloning before 12c was mainly done with these well known tools: rman duplicate, datapump, GoldenGate, dblinks, storage cloning and the embedded clone.pl script (I hadn't heard about this one before).

Starting from 12c, and only if you’re using multitenant, new convenient tools are available for cloning: PDB snapshot copy, snapshot carousel, refreshable copy, …

I discovered that you can duplicate a PDB without actually putting the source PDB in read only mode: you just need to put your source PDB in begin backup mode, copy the files, generate the metadata file and create the database with resetlogs. Nice feature.

You have to know that cloning a PDB is native with multitenant, a database being always a clone of something (at least an empty PDB is created from PDB$seed).

Note that snapshot copy of a PDB is limited to some kinds of filesystems, the best known being NFS and ACFS. If you decide to go for multitenant without actually having the option, don't forget to limit the maximum number of PDBs in your CDB settings: it's actually a parameter, MAX_PDBS. Another interesting feature is the possibility to create a PDB from a source PDB without the data (but tablespaces and tables are created).

Finally, and against all odds, datapump is still a great tool for most of the cases. You’d better still consider this tool too.

Conclusion

This was a great event from great organizers, and even if pure Oracle DBA is probably not a job that makes younger people dream, jobs dealing with data are not going to disappear in the near future.

Cet article A day of conferences with the Swiss Oracle User Group est apparu en premier sur Blog dbi services.

PostgreSQL 12 : Setting Up Streaming Replication

Sat, 2019-11-16 11:29

PostgreSQL 12 was released a few months ago. When actually setting up a replication, there is no longer a recovery.conf file in the PGDATA. Indeed, all parameters of the former recovery.conf now go into the postgresql.conf file. And in the cluster data directory of the standby server, there should be a file named standby.signal to trigger the standby mode.
In this blog I am just building a streaming replication between 2 servers to show these changes. The configuration we are using is:
Primary server dbi-pg-essentials : 192.168.56.101
Standby server dbi-pg-essentials-2 : 192.168.56.102

The primary server is up and running on dbi-pg-essentials server.

postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12] pg12

********* dbi services Ltd. *********
                  STATUS  : OPEN
         ARCHIVE_COMMAND  : test ! -f /u99/pgdata/12/archived_wal/%f && cp %p /u99/pgdata/12/archived_wal/%f
            ARCHIVE_MODE  : on
    EFFECTIVE_CACHE_SIZE  : 4096MB
                   FSYNC  : on
          SHARED_BUFFERS  : 128MB
      SYNCHRONOUS_COMMIT  : on
                WORK_MEM  : 4MB
              IS_STANDBY  : false
*************************************

postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12]

Step 1 : Prepare the user for the replication on the primary server
For streaming replication, we need a user to read the WAL stream; we can do it with a superuser but it is not required. We will create a user with the REPLICATION and LOGIN privileges. Contrary to the SUPERUSER privilege, the REPLICATION privilege gives very high permissions but does not allow modifying any data.
Here we will create a user named repliuser

postgres=# create user repliuser with password 'postgres'  replication ;
CREATE ROLE
postgres=#

Step 2 : Prepare the authentication on the primary server
The user used for the replication should be allowed to connect for replication. We then need to adjust the pg_hba.conf file on the two servers.

postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12] grep repliuser pg_hba.conf
host    replication     repliuser        192.168.56.101/32        md5
host    replication     repliuser        192.168.56.102/32        md5
postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12]

Step 3 : Create a replication slot on the primary server
Replication slots provide an automated way to ensure that the master does not remove WAL segments until they have been received by all standbys, and that the master does not remove rows which could cause a recovery conflict even when the standby is disconnected.

psql (12.1 dbi services build)
Type "help" for help.

postgres=# SELECT * FROM pg_create_physical_replication_slot('pg_slot_1');
 slot_name | lsn
-----------+-----
 pg_slot_1 |
(1 row)

postgres=#
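
To double check that the slot really exists, a quick hedged sketch (it will show up as active once the standby is streaming):

select slot_name, slot_type, active from pg_replication_slots;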

Step 4 : Do a backup of the primary database and restore it on the standby
From the standby server launch the following command

postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] pg_basebackup -h 192.168.56.101 -D /u02/pgdata/12/PG1 --wal-method=fetch -U repliuser

Step 5 : Set the primary connection info for the streaming on the standby side
The host name and port number of the primary, the connection user name and the password are specified in primary_conninfo. Here there is a little change as there is no longer a recovery.conf file: primary_conninfo should now be specified in the postgresql.conf.

postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] grep primary postgresql.conf
primary_conninfo = 'host=192.168.56.101 port=5432 user=repliuser password=postgres'
primary_slot_name = 'pg_slot_1'                 # replication slot on sending server

Step 6 : Create the standby.signal file on standby server
In the cluster data directory of the standby, create a file standby.signal

postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] pwd
/u02/pgdata/12/PG1
postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] touch standby.signal

Step 7 : Then start the standby cluster

postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] pg_ctl start

If everything is fine, you should find the following in the log file of the standby:

2019-11-16 17:41:21.552 CET [1590] LOG:  database system is ready to accept read only connections
2019-11-16 17:41:21.612 CET [1596] LOG:  started streaming WAL from primary at 0/5000000 on timeline 1

As confirmed by the dbi DMK tool, the master is now streaming to the standby server:

********* dbi services Ltd. *********
                  STATUS  : OPEN
         ARCHIVE_COMMAND  : test ! -f /u99/pgdata/12/archived_wal/%f && cp %p /u99/pgdata/12/archived_wal/%f
            ARCHIVE_MODE  : on
    EFFECTIVE_CACHE_SIZE  : 4096MB
                   FSYNC  : on
          SHARED_BUFFERS  : 128MB
      SYNCHRONOUS_COMMIT  : on
                WORK_MEM  : 4MB
              IS_STANDBY  : false
               IS_MASTER  : YES, streaming to 192.168.56.102/32
*************************************

postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12]
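
Another way to verify this from the primary itself is the pg_stat_replication view, a hedged sketch (the values will obviously differ on your system):

-- run on the primary
select client_addr, state, sync_state, replay_lsn from pg_stat_replication;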

Cet article PostgreSQL 12 : Setting Up Streaming Replication est apparu en premier sur Blog dbi services.

Elapsed time of Oracle Parallel Executions are not shown correctly in AWR

Fri, 2019-11-15 10:08

As the elapsed time (the time a task takes from start to end, often called wall-clock time) per execution of parallel queries is not shown correctly in AWR reports, I set up a test case to find a way to get an elapsed time closer to reality.

REMARK: To use AWR (Automatic Workload Repository) and ASH (Active Session History) as described in this Blog you need to have the Oracle Diagnostics Pack licensed.

I created a table t5 with 213K blocks:

SQL> select blocks from tabs where table_name='T5';
 
    BLOCKS
----------
    213064

In addition I enabled Linux IO throttling with 300 IOs/sec through a cgroup on my device sdb (the kernel has to provide the two options below) to ensure the parallel statements take a couple of seconds to run:

[root@19c ~]# CONFIG_BLK_CGROUP=y
[root@19c ~]# CONFIG_BLK_DEV_THROTTLING=y
[root@19c ~]# echo "8:16 300" > /sys/fs/cgroup/blkio/blkio.throttle.read_iops_device

After that I ran my test:

SQL> select sysdate from dual;
 
SYSDATE
-------------------
14.11.2019 14:03:51
 
SQL> exec dbms_workload_repository.create_snapshot;
 
PL/SQL procedure successfully completed.
 
SQL> set timing on
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.63
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.62
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.84
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.73
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.63
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.74
SQL> exec dbms_workload_repository.create_snapshot;
 
PL/SQL procedure successfully completed.

Please consider the elapsed time of about 5.7 seconds per execution.

The AWR-report shows the following in the “SQL ordered by Elapsed time”-section:

        Elapsed                  Elapsed Time
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
            67.3              6         11.22   94.5   37.4   61.3 04r3647p2g7qu
Module: SQL*Plus
select /*+ parallel(t5 2) full(t5) */ count(*) from t5

I.e. 11.22 seconds on average per execution. However, as we can see above, the average execution time is around 5.7 seconds. The reason for the wrong elapsed time per execution is that the elapsed time of the parallel slaves is added to the elapsed time of the query coordinator, even though the processes worked in parallel. Thanks to the column SQL_EXEC_ID (very useful) we can get the sum of the elapsed times per execution from ASH:

SQL> break on report
SQL> compute avg of secs_db_time on report
SQL> select sql_exec_id, qc_session_id, qc_session_serial#, count(*) secs_db_time from v$active_session_history
  2  where sql_id='04r3647p2g7qu' and sample_time>to_date('14.11.2019 14:03:51','dd.mm.yyyy hh24:mi:ss')
  3  group by sql_exec_id, qc_session_id, qc_session_serial#
  4  order by 1;
 
SQL_EXEC_ID QC_SESSION_ID QC_SESSION_SERIAL# SECS_DB_TIME
----------- ------------- ------------------ ------------
   16777216	      237                  16626           12
   16777217	      237                  16626           12
   16777218	      237                  16626           10
   16777219	      237                  16626           12
   16777220	      237                  16626           10
   16777221	      237                  16626           10
                                             ------------
avg                                                    11
 
6 rows selected.

I.e. the 11 secs correspond to the 11.22 secs in the AWR-report.

How do we get the real elapsed time for the parallel queries? If the queries take a couple of seconds we can get the approximate time from ASH as well by subtracting the sample_time at the beginning from the sample_time at the end of each execution (SQL_EXEC_ID):

SQL> select sql_exec_id, extract (second from (max(sample_time)-min(sample_time))) secs_elapsed 
  2  from v$active_session_history
  3  where sql_id='04r3647p2g7qu'
  4  and sample_time>to_date('14.11.2019 14:03:51','dd.mm.yyyy hh24:mi:ss')
  5  group by sql_exec_id
  6  order by 1;
 
SQL_EXEC_ID SECS_ELAPSED
----------- ------------
   16777216         5.12
   16777217        5.104
   16777218         4.16
   16777219        5.118
   16777220        4.104
   16777221        4.171
 
6 rows selected.

I.e. those numbers reflect the real execution time much better.

REMARK: If the queries take minutes (or hours) to run then you have to extract the minutes (and hours) as well of course. See also the example I have at the end of the Blog.
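
A more generic variant of the query above (a sketch along the same lines, not validated for runs spanning days) adds up hours, minutes and seconds:

SQL> select sql_exec_id,
  2         extract(hour   from (max(sample_time)-min(sample_time)))*3600 +
  3         extract(minute from (max(sample_time)-min(sample_time)))*60 +
  4         extract(second from (max(sample_time)-min(sample_time))) secs_elapsed
  5  from v$active_session_history
  6  where sql_id='04r3647p2g7qu'
  7  group by sql_exec_id
  8  order by 1;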

The info in V$SQL is also not very helpful:

SQL> set lines 200 pages 999
SQL> select child_number, plan_hash_value, elapsed_time/1000000 elapsed_secs, 
  2  executions, px_servers_executions, last_active_time 
  3  from v$sql where sql_id='04r3647p2g7qu';
 
CHILD_NUMBER PLAN_HASH_VALUE ELAPSED_SECS EXECUTIONS PX_SERVERS_EXECUTIONS LAST_ACTIVE_TIME
------------ --------------- ------------ ---------- --------------------- -------------------
           0      2747857355    67.346941          6                    12 14.11.2019 14:05:17

I.e. for the QC we have the column executions > 0 and for the parallel slaves we have px_servers_executions > 0. You may actually get different child cursors for the Query Coordinator and the slaves.

So in theory we should be able to do something like:

SQL> select child_number, (sum(elapsed_time)/sum(executions))/1000000 elapsed_time_per_exec_secs 
  2  from v$sql where sql_id='04r3647p2g7qu' group by child_number;
 
CHILD_NUMBER ELAPSED_TIME_PER_EXEC_SECS
------------ --------------------------
           0                 11.2244902

Here we do see the number from the AWR again.

So in the future be careful when checking the elapsed time per execution of statements that ran with parallel slaves. The number will be too high in AWR or V$SQL, and further analysis is necessary to get the real elapsed time per execution.

REMARK: As the numbers in AWR do come from e.g. dba_hist_sqlstat, the following query provides “wrong” output for parallel executions as well:

SQL> column begin_interval_time format a32
SQL> column end_interval_time format a32
SQL> select begin_interval_time, end_interval_time, ELAPSED_TIME_DELTA/1000000 elapsed_time_secs, 
  2  (ELAPSED_TIME_DELTA/EXECUTIONS_DELTA)/1000000 elapsed_per_exec_secs
  3  from dba_hist_snapshot snap, dba_hist_sqlstat sql 
  4  where snap.snap_id=sql.snap_id and sql_id='04r3647p2g7qu' 
  5  and snap.BEGIN_INTERVAL_TIME > to_date('14.11.2019 14:03:51','dd.mm.yyyy hh24:mi:ss');
 
BEGIN_INTERVAL_TIME              END_INTERVAL_TIME                ELAPSED_TIME_SECS ELAPSED_PER_EXEC_SECS
-------------------------------- -------------------------------- ----------------- ---------------------
14-NOV-19 02.04.00.176 PM        14-NOV-19 02.05.25.327 PM                67.346941            11.2244902

To take another example I did run a query from Jonathan Lewis from
https://jonathanlewis.wordpress.com/category/oracle/parallel-execution:

SQL> @jonathan
 
19348 rows selected.
 
Elapsed: 00:06:42.11

I.e. 402.11 seconds

AWR shows 500.79 seconds:

        Elapsed                  Elapsed Time
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
           500.8              1        500.79   97.9   59.6   38.6 44v4ws3nzbnsd
Module: SQL*Plus
select /*+ parallel(t1 2) parallel(t2 2)
 leading(t1 t2) use_hash(t2) swa
p_join_inputs(t2) pq_distribute(t2 hash hash) ca
rdinality(t1,50000) */ t1.owner, t1.name, t1.typ

Let’s check ASH with the query I used above (this time including minutes):

select sql_exec_id, extract (minute from (max(sample_time)-min(sample_time))) minutes_elapsed,
extract (second from (max(sample_time)-min(sample_time))) secs_elapsed 
from v$active_session_history
where sql_id='44v4ws3nzbnsd'
group by sql_exec_id
order by 1;
 
SQL_EXEC_ID MINUTES_ELAPSED SECS_ELAPSED
----------- --------------- ------------
   16777216	              6       40.717

I.e. 06:40.72 which is close to the real elapsed time of 06:42.11

The article Elapsed time of Oracle Parallel Executions are not shown correctly in AWR appeared first on the dbi services blog.

A simple Terraform script to create an AWS EC2 playground

Mon, 2019-11-11 02:39

When I started to write the blog about AWS SSM I quickly realized that I needed a way to bring up and destroy my AWS EC2 playground in an easy and reproducible way. There are several options for this: I could have used the AWS command line interface or AWS CloudFormation. Both work pretty well and would have been more than enough for my simple requirements. In the end I decided to use Terraform for one simple reason: with Terraform you can not only provision on AWS but also on Azure, GCP and many others. So, instead of using a tool which is limited to AWS, using a tool which is vendor independent makes much more sense.

For the AWS SSM blog I had several requirements I wanted to address:

  • The playground should run in my own VPC so I will not affect any other people doing tests at the same time
  • Because I had Windows and Linux EC2 instances, two security groups should get created, one allowing SSH and one allowing RDP into the EC2 machines
  • Both security groups should allow outbound connections to the internet by using an internet gateway
  • Finally two Windows, two Ubuntu, two Amazon Linux, two Red Hat, two SLES and two CentOS instances should get provisioned

Using Terraform all of this is actually quite simple. The first information you’ll need to tell Terraform is the provider you want to use. In my case this is AWS and I am using my “test” AWS profile which is configured for my AWS command line interface, the AWS region I want to use is Frankfurt:

// set the provider to AWS and the AWS region to eu-central-1
provider "aws" {
  profile    = "test"
  region     = "eu-central-1"
}

Because I want to limit direct access to the EC2 instances to my own IP address for security reasons, I define a local variable which I change each time I set this up from a different network. Input variables could also be used for that but in my case a local variable is just fine:

locals {
  my_ip        = ["37.201.6.8/32"]
}
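
If you prefer input variables over a local value, a roughly equivalent definition could look like the sketch below (the variable name is arbitrary and you would then reference it as var.my_ip instead of local.my_ip):

variable "my_ip" {
  description = "CIDR block(s) allowed to access the EC2 instances"
  type        = list(string)
  default     = ["37.201.6.8/32"]
}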

The first component that actually gets provisioned on AWS is the VPC, using the aws_vpc resource:

// create the virtual private network
resource "aws_vpc" "dwe-vpc" {
  cidr_block = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support = true

  tags = {
    Name = "dwe-vpc"
  }
}

This defines the IP range I’ll be using, enables the DNS parameters and gets a tag. As I want my EC2 instances to be able to connect to the internet, an internet gateway gets provisioned in the next step, attached to the VPC that was created in the step before:

// create the internet gateway
resource "aws_internet_gateway" "dwe-igw" {
  vpc_id = "${aws_vpc.dwe-vpc.id}"

  tags = {
    Name = "dwe-igw"
  }
}

Next we need a subnet, again attached to the VPC:

// create a dedicated subnet
resource "aws_subnet" "dwe-subnet" {
  vpc_id            = "${aws_vpc.dwe-vpc.id}"
  cidr_block        = "10.0.1.0/24"
  availability_zone = "eu-central-1a"

  tags = {
    Name = "dwe-subnet"
  }
}

Routing tables are an important concept in AWS and define how traffic is routed in the VPC. The routing table below enables traffic to the internet through the internet gateway:

// create routing table which points to the internet gateway
resource "aws_route_table" "dwe-route" {
  vpc_id = "${aws_vpc.dwe-vpc.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.dwe-igw.id}"
  }

  tags = {
    Name = "dwe-igw"
  }
}

Once the routing table is defined it needs to be attached to the subnet:

// associate the routing table with the subnet
resource "aws_route_table_association" "subnet-association" {
  subnet_id      = "${aws_subnet.dwe-subnet.id}"
  route_table_id = "${aws_route_table.dwe-route.id}"
}

For ssh and rdp access to the EC2 instances two security groups get provisioned:

// create a security group for ssh access to the linux systems
resource "aws_security_group" "dwe-sg-ssh" {
  name        = "dwe-sg-ssh"
  description = "Allow SSH inbound traffic"
  vpc_id      = "${aws_vpc.dwe-vpc.id}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = local.my_ip
  }

  // allow access to the internet
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "dwe-sg-ssh"
  }
}

// create a security group for rdp access to the windows systems
resource "aws_security_group" "dwe-sg-rdp" {
  name        = "dwe-sg-rdp"
  vpc_id      = "${aws_vpc.dwe-vpc.id}"
  description = "Allow RDP inbound traffic"

  ingress {
    from_port   = 3389
    to_port     = 3389
    protocol    = "tcp"
    cidr_blocks = local.my_ip
  }

  // allow access to the internet
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }


  tags = {
    Name = "dwe-sg-rdp"
  }
}

Both define an inbound rule (SSH and RDP respectively) and an outbound rule for being able to connect to the internet. Now, as all the basics are there, the EC2 instances themselves get provisioned based on the building blocks from above:

// create two Ubuntu instances
resource "aws_instance" "i-ubuntu-linux-prod" {
  ami                         = "ami-0cc0a36f626a4fdf5"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = {
    Name = "i-ubuntu-linux-prod"
  }  
}

resource "aws_instance" "i-ubuntu-linux-test" {
  ami                         = "ami-0cc0a36f626a4fdf5"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-ubuntu-linux-test"
  } 
}

// create two Amazon linux instances
resource "aws_instance" "i-amazon-linux-prod" {
  ami                         = "ami-0f3a43fbf2d3899f7"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-amazon-linux-prod"
  } 
}

resource "aws_instance" "i-amazon-linux-test" {
  ami                         = "ami-0f3a43fbf2d3899f7"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-amazon-linux-test"
  } 
}

// create two Red Hat linux instances
resource "aws_instance" "i-redhat-linux-prod" {
  ami                         = "ami-0badcc5b522737046"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-redhat-linux-prod"
  } 
}

resource "aws_instance" "i-redhat-linux-test" {
  ami                         = "ami-0badcc5b522737046"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"
  
  root_block_device {     
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = { 
    Name = "i-redhat-linux-test"
  }
}

// create two SUSE linux instances
resource "aws_instance" "i-suse-linux-prod" {
  ami                         = "ami-03b86a97a8f02d44e"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = { 
    Name = "i-suse-linux-prod"
  }
}

resource "aws_instance" "i-suse-linux-test" {
  ami                         = "ami-03b86a97a8f02d44e"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"
  
  root_block_device {     
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = { 
    Name = "i-suse-linux-test"
  }
}

// create two CentOS linux instances
resource "aws_instance" "i-centos-linux-prod" {
  ami                         = "ami-04cf43aca3e6f3de3"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = { 
    Name = "i-centos-linux-prod"
  }
}

resource "aws_instance" "i-centos-linux-test" {
  ami                         = "ami-04cf43aca3e6f3de3"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"
  
  root_block_device {     
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = { 
    Name = "i-centos-linux-test"
  }
}

// create two Windows instances
resource "aws_instance" "i-windows-prod" {
  ami           = "ami-034937fd7f621ba85"
  instance_type = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-rdp.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "30"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = { 
    Name = "i-windows-prod"
  }
}

resource "aws_instance" "i-windows-test" {
  ami           = "ami-034937fd7f621ba85"
  instance_type = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-rdp.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "30"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-windows-test"
  }
}
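
As the instance resources above only differ in AMI, security group and root volume size, the duplication could be reduced with Terraform's count (or for_each) meta-argument. A minimal sketch for the two Ubuntu instances (illustrative only, not part of the configuration above):

resource "aws_instance" "i-ubuntu-linux" {
  count                       = 2
  ami                         = "ami-0cc0a36f626a4fdf5"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  // the first instance becomes "prod", the second one "test"
  tags = {
    Name = count.index == 0 ? "i-ubuntu-linux-prod" : "i-ubuntu-linux-test"
  }
}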

And that’s it. To check what actually will be done by Terraform there is the “plan” command. As the output is quite long I’ll skip most of it and just present the last few lines:

dwe@dwe:~/Documents/aws/ssm_demo$ terraform plan

...

Plan: 19 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

Once you are happy with that you can “apply” the execution plan and everything will get provisioned once you confirm with “yes”:

dwe@dwe:~/Documents/aws/ssm_demo$ terraform apply

Plan: 19 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

Sit back, relax and one or two minutes later your AWS playground is ready:

aws_vpc.dwe-vpc: Creating...
aws_vpc.dwe-vpc: Creation complete after 2s [id=vpc-026cdc481d5365074]
aws_internet_gateway.dwe-igw: Creating...
aws_subnet.dwe-subnet: Creating...
aws_security_group.dwe-sg-ssh: Creating...
aws_security_group.dwe-sg-rdp: Creating...
aws_subnet.dwe-subnet: Creation complete after 0s [id=subnet-028e27fef8df3b963]
aws_internet_gateway.dwe-igw: Creation complete after 0s [id=igw-0656108a04d5ea0a5]
aws_route_table.dwe-route: Creating...
aws_security_group.dwe-sg-rdp: Creation complete after 1s [id=sg-0764508e3a5234393]
aws_route_table.dwe-route: Creation complete after 1s [id=rtb-07691bc54b40af0ae]
aws_route_table_association.subnet-association: Creating...
aws_instance.i-windows-prod: Creating...
aws_instance.i-windows-test: Creating...
aws_security_group.dwe-sg-ssh: Creation complete after 1s [id=sg-053995952f558a4ff]
aws_instance.i-centos-linux-test: Creating...
aws_instance.i-amazon-linux-prod: Creating...
aws_instance.i-amazon-linux-test: Creating...
aws_instance.i-redhat-linux-prod: Creating...
aws_instance.i-centos-linux-prod: Creating...
aws_instance.i-redhat-linux-test: Creating...
aws_instance.i-ubuntu-linux-test: Creating...
aws_route_table_association.subnet-association: Creation complete after 0s [id=rtbassoc-07c7d4282033c4a71]
aws_instance.i-ubuntu-linux-prod: Creating...
aws_instance.i-windows-prod: Still creating... [10s elapsed]
aws_instance.i-windows-test: Still creating... [10s elapsed]
aws_instance.i-centos-linux-test: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-test: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-prod: Still creating... [10s elapsed]
aws_instance.i-redhat-linux-prod: Still creating... [10s elapsed]
aws_instance.i-centos-linux-prod: Still creating... [10s elapsed]
aws_instance.i-redhat-linux-test: Still creating... [10s elapsed]
aws_instance.i-ubuntu-linux-test: Still creating... [10s elapsed]
aws_instance.i-ubuntu-linux-prod: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-test: Creation complete after 13s [id=i-02706717c3440723a]
aws_instance.i-suse-linux-test: Creating...
aws_instance.i-ubuntu-linux-test: Creation complete after 16s [id=i-0d2999aa319a90a3d]
aws_instance.i-centos-linux-test: Creation complete after 16s [id=i-03923fcf9d5881421]
aws_instance.i-suse-linux-prod: Creating...
aws_instance.i-ubuntu-linux-prod: Creation complete after 16s [id=i-00967725bc758f3ef]
aws_instance.i-redhat-linux-test: Creation complete after 16s [id=i-02a705327fb0acb61]
aws_instance.i-windows-prod: Creation complete after 16s [id=i-09c3fcc90491ef2cf]
aws_instance.i-redhat-linux-prod: Creation complete after 16s [id=i-0161726dfe1ed890b]
aws_instance.i-windows-test: Creation complete after 16s [id=i-02f567d11d32444fd]
aws_instance.i-amazon-linux-prod: Still creating... [20s elapsed]
aws_instance.i-centos-linux-prod: Still creating... [20s elapsed]
aws_instance.i-suse-linux-test: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-prod: Creation complete after 23s [id=i-0b799879e77ce8b33]
aws_instance.i-suse-linux-prod: Still creating... [10s elapsed]
aws_instance.i-centos-linux-prod: Still creating... [30s elapsed]
aws_instance.i-centos-linux-prod: Creation complete after 32s [id=i-0482d958849f86483]
aws_instance.i-suse-linux-test: Still creating... [20s elapsed]
aws_instance.i-suse-linux-prod: Still creating... [20s elapsed]
aws_instance.i-suse-linux-test: Still creating... [30s elapsed]
aws_instance.i-suse-linux-test: Creation complete after 32s [id=i-0b35d559853f9f0d6]
aws_instance.i-suse-linux-prod: Still creating... [30s elapsed]
aws_instance.i-suse-linux-prod: Creation complete after 33s [id=i-062df970b894a23da]

Apply complete! Resources: 19 added, 0 changed, 0 destroyed.

Once you’re done with your tests, simply destroy everything by using the “destroy” command:

dwe@dwe:~/Documents/aws/ssm_demo$ terraform destroy
...
Plan: 0 to add, 0 to change, 19 to destroy.

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes
...
aws_security_group.dwe-sg-rdp: Destroying... [id=sg-0764508e3a5234393]
aws_security_group.dwe-sg-rdp: Destruction complete after 1s
aws_security_group.dwe-sg-ssh: Destruction complete after 1s
aws_subnet.dwe-subnet: Destruction complete after 1s
aws_vpc.dwe-vpc: Destroying... [id=vpc-026cdc481d5365074]
aws_vpc.dwe-vpc: Destruction complete after 0s

Destroy complete! Resources: 19 destroyed.

Quite easy, always reproducible and fast.

The article A simple Terraform script to create an AWS EC2 playground appeared first on the dbi services blog.

odacli create-database error DCS-10802: Insufficient disk space on file system: database

Mon, 2019-11-11 02:27
Introduction

I was reimaging an X6-2M ODA after various patching troubles, and everything was fine. But after several databases had been created, the next ones could no longer be created.

DCS-10802: Insufficient disk space on file system: database. Expected free space (MB): {1}, available space (MB): {2}

I’ve spent some time on it, and finally found the cause of the problem. And the solution.

Context

After successfully reimaging an X6-2M ODA with 18.5 and applying the patches for the firmware, ILOM and disks, I was creating all the databases with odacli using the following commands:


odacli create-database -hm XyXyXyXyXyXy --dbstorage ACFS --characterset WE8MSWIN1252 --databaseUniqueName HPMVRN --dbhomeid '0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb' --dblanguage AMERICAN --dbname HPMVRN --dbshape odb1s --dbterritory AMERICA --no-cdb --no-dbconsole --json
odacli create-database ...

Each database is created with the smallest shape odb1s, as I later fine tune each instance according to my needs.

After the 8th or 9th database was created, the next ones ended with a failure:

odacli describe-job -i "2ad9f7e8-331d-4a82-bb7b-c88ddad1cdf8"

Job details
----------------------------------------------------------------
ID: 2ad9f7e8-331d-4a82-bb7b-c88ddad1cdf8
Description: Database service creation with db name: HPMVRN
Status: Failure
Created: November 8, 2019 1:33:23 PM CET
Message: DCS-10802:Insufficient disk space on file system: database. Expected free space (MB): {1}, available space (MB): {2}

Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Database Service creation November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:58 PM CET Failure
Database Service creation November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:58 PM CET Failure
Setting up ssh equivalance November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:23 PM CET Success
Creating volume datHPMVRN November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:42 PM CET Success
Creating ACFS filesystem for DATA November 8, 2019 1:33:42 PM CET November 8, 2019 1:33:57 PM CET Success
Database Service creation November 8, 2019 1:33:57 PM CET November 8, 2019 1:33:58 PM CET Failure
Database Creation November 8, 2019 1:33:57 PM CET November 8, 2019 1:33:58 PM CET Failure

Analysis

Error seems obvious: Insufficient disk space. Let’s check the disk space:


df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot
30G 5.6G 23G 20% /
tmpfs 126G 1.1G 125G 1% /dev/shm
/dev/sda1 477M 41M 411M 9% /boot
/dev/mapper/VolGroupSys-LogVolOpt
109G 85G 19G 82% /opt
/dev/mapper/VolGroupSys-LogVolU01
148G 31G 110G 22% /u01
/dev/asm/datdbtest-8 100G 2.2G 98G 3% /u02/app/oracle/oradata/DBTEST
/dev/asm/datgotst-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/GOTST
/dev/asm/datgeval-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/GEVAL
/dev/asm/commonstore-8
5.0G 391M 4.7G 8% /opt/oracle/dcs/commonstore
/dev/asm/datsmval-8
100G 44G 57G 44% /u02/app/oracle/oradata/SMVAL
/dev/asm/datvival-8
100G 34G 67G 34% /u02/app/oracle/oradata/VIVAL
/dev/asm/datvjval-8
100G 56G 45G 56% /u02/app/oracle/oradata/VJVAL
/dev/asm/dump-8 200G 132G 69G 66% /dpdumplocal
/dev/asm/dataoval-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/AOVAL
/dev/asm/datgrtst-8 100G 2.2G 98G 3% /u02/app/oracle/oradata/GRTST
/dev/asm/datgival-8
100G 7.8G 93G 8% /u02/app/oracle/oradata/GIVAL
/dev/asm/reco-329 74G 56G 19G 76% /u03/app/oracle
/dev/asm/datgetst-8 100G 2.2G 98G 3% /u02/app/oracle/oradata/GETST
/dev/asm/datgftst-8
100G 30G 71G 30% /u02/app/oracle/oradata/GFTST
/dev/asm/datgctst-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/GCTST
/dev/asm/dathpmvrn-8 100G 448M 100G 1% /u02/app/oracle/oradata/HPMVRN

No filesystem is full. And create-database managed to create the acfs volume for data successfully. Let’s try to put something in it:


cp /the_path/the_big_file /u02/app/oracle/oradata/HPMVRN/

No problem with this acfs volume.

Let’s try to create the database into ASM:


odacli list-databases | tail -n 1

bed7f9a0-e108-4423-8b2c-d7c33c795e87 HPMVRN Si 12.1.0.2 false Oltp Odb1s Acfs Failed 0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb

odacli delete-database -i "bed7f9a0-e108-4423-8b2c-d7c33c795e87"

odacli create-database -hm XyXyXyXyXyXy --dbstorage ASM --characterset WE8MSWIN1252 --databaseUniqueName HPMVRN --dbhomeid '0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb' --dblanguage AMERICAN --dbname HPMVRN --dbshape odb1s --dbterritory AMERICA --no-cdb --no-dbconsole --json

No problem here, really seems to be related to acfs.

Where does the create-database also need free space? For sure, in the RECOvery area filesystem, created with the very first database:


Filesystem Size Used Avail Use% Mounted on
/dev/asm/reco-329 74G 56G 19G 76% /u03/app/oracle

Let’s create a file in this filesystem:

cp /the_path/the_big_file /u03/app/oracle/

No problem.

Quite strange, and as the database is not yet created, there is no alert_HPMVRN.log to look for the error…

Maybe the RECOvery area filesystem is not big enough for odacli. The ACFS concept relies on autoextensible filesystems, but as the Fast Recovery Areas of all my databases probably won’t fit in the allocated 74GB, odacli may fail. How to extend the ACFS RECOvery filesystem? Just put enough dummy files in it, and then remove them.


cd /u03/app/oracle
cp /the_path/the_big_file tmpfile1
cp tmpfile1 tmpfile2
cp tmpfile1 tmpfile3
cp tmpfile1 tmpfile4
cp tmpfile1 tmpfile5
cp tmpfile1 tmpfile6
rm -rf tmp*

df -h /u03/app/oracle
Filesystem Size Used Avail Use% Mounted on
/dev/asm/reco-329 88G 59G 30G 67% /u03/app/oracle

RECOvery filesystem is slightly bigger, let’s retry the database creation:


odacli delete-database -i "985b1d37-6f84-4d64-884f-3a429c195a5d"

odacli create-database -hm XyXyXyXyXyXy --dbstorage ACFS --characterset WE8MSWIN1252 --databaseUniqueName HPMVRN --dbhomeid '0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb' --dblanguage AMERICAN --dbname HPMVRN --dbshape odb1s --dbterritory AMERICA --no-cdb --no-dbconsole --json

odacli describe-job -i "d5022d8b-9ddb-4f93-a84a-36477657794f"

Job details
----------------------------------------------------------------
ID: d5022d8b-9ddb-4f93-a84a-36477657794f
Description: Database service creation with db name: HPMVRN
Status: Success
Created: November 8, 2019 1:47:28 PM CET
Message:

Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Setting up ssh equivalance November 8, 2019 1:47:30 PM CET November 8, 2019 1:47:40 PM CET Success
Creating volume datHPMVRN November 8, 2019 1:47:40 PM CET November 8, 2019 1:47:59 PM CET Success
Creating ACFS filesystem for DATA November 8, 2019 1:47:59 PM CET November 8, 2019 1:48:14 PM CET Success
Database Service creation November 8, 2019 1:48:14 PM CET November 8, 2019 1:54:38 PM CET Success
Database Creation November 8, 2019 1:48:14 PM CET November 8, 2019 1:53:02 PM CET Success
Change permission for xdb wallet files November 8, 2019 1:53:02 PM CET November 8, 2019 1:53:02 PM CET Success
Place SnapshotCtrlFile in sharedLoc November 8, 2019 1:53:02 PM CET November 8, 2019 1:53:04 PM CET Success
SqlPatch upgrade November 8, 2019 1:54:02 PM CET November 8, 2019 1:54:35 PM CET Success
updating the Database version November 8, 2019 1:54:35 PM CET November 8, 2019 1:54:38 PM CET Success
create Users tablespace November 8, 2019 1:54:38 PM CET November 8, 2019 1:54:40 PM CET Success

This is it. I was then able to create the next databases without any problem.

Final words

Keep in mind that all your Fast Recovery Areas should fit in your dedicated filesystem. To make sure that you will not encounter problems, resize your RECOvery filesystem to the sum of all your target FRA sizes with acfsutil:

acfsutil size 500G -d /dev/asm/reco-329 /u03/app/oracle
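
To verify the current size and free space of an ACFS filesystem before and after the resize, acfsutil info fs should do the job (output not shown here):

acfsutil info fs /u03/app/oracle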

An autoextensible filesystem never guarantees that an extension will actually succeed.

Or simply go for ASM instead of ACFS on your ODA if you do not need ACFS features like snapshots. ASM is simpler and more efficient because it does not provision disk space for each database like ACFS does.

The article odacli create-database error DCS-10802: Insufficient disk space on file system: database appeared first on the dbi services blog.

Avoiding patching madness by using AWS SSM

Fri, 2019-11-08 13:15

As we have more and more customers either currently moving to AWS or having already moved much, if not all, of their workloads to AWS, one question pops up quite frequently: Now that we have so many EC2 instances up and running, how do we regularly patch them? There are many tools around for patching operating systems, such as Microsoft SCCM, Red Hat Satellite or SUSE Manager, just to name a few. There are many others as well but most of them have an important drawback: Either they can only patch Windows systems or they can only patch Linux systems (and even here usually only a subset of the distributions currently in use). This is where AWS Systems Manager becomes very handy as you can patch most of the commonly used operating systems using one single tool. You can even patch your on-prem instances if your network setup includes the on-prem instances as well. In this post I’ll share what we did for one of our customers to remove the pain of either using multiple tools to patch all the operating systems or manually patching all the systems and then (manually) documenting what has been done and what the result was.

Managed Instances

Once you start with AWS SSM there are a few key concepts you need to understand. The first one is the so-called “Managed Instance”. For an instance to be managed it must fulfill two requirements:

  1. The AWS agent must be deployed on the instance
  2. An IAM role with the correct permission must be assigned to the instance

For the SSM agent it is quite easy if you deployed any of these AMIs, because the agent will be there by default:

  • Windows Server 2003-2012 R2 AMIs published in November 2016 or later
  • Amazon Linux
  • Amazon Linux 2
  • Ubuntu Server 16.04
  • Ubuntu Server 18.04

If you are running something else the agent needs to be installed manually as described here for Windows and here for Linux. For on-prem instances it is straightforward as well (Windows here, Linux here) but you additionally need to create Managed Instance Activations, which is not in the scope of this post.
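
If you prefer the command line over the console, the AWS CLI can list the instances SSM currently knows about (a quick check, output omitted here):

aws ssm describe-instance-information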

For the IAM role we’ll first look at what I have currently deployed:

These are two Windows, two Red Hat, two Ubuntu, two Amazon Linux, two CentOS and two SUSE instances, one for Prod and one for Test each. The “Prod” and “Test” naming becomes important later, just ignore it for now. If we take a look at how many managed instances we have right now in Systems Manager we will see zero, even though the Windows, Ubuntu and Amazon Linux instances have the SSM agent deployed automatically (otherwise we would see a list of managed instances here):

We will use the “Quick Setup” here to setup the IAM role and instance profiles:

For the scope of this post we will go with defaults suggested by AWS:

We want all the instances:


The last screen shows an overview of what is currently happening, basically collecting all the information that will be stored in the AWS SSM inventory. Before you’ll see any instances in “Managed Instances” you will need to wait until the pending inventory actions have been completed:

Once that is done all the managed instances should be there (notice that I manually deployed the AWS SSM agent on the SUSE, CentOS and Red Hat instances, otherwise they would not show up here):

Patch baselines

Once all instances are “managed” there is the next concept we need to understand: Patch baselines. Patch baselines basically define which set of patches (classification, severity, …) shall get applied to which product (e.g. Red Hat 7, Red Hat 8, …). As we have six different operating systems we need at least six patch baselines. In reality you’ll probably even need twelve because you will have production and test systems. Usually you want to have the same patches applied to production as you applied to test some days or weeks before and therefore we need two patch baselines per operating system flavor. Patch baselines are listed in SSM under “Patch manager”:

The following screenshots show how I created the patch baseline for the Red Hat test systems:

For the Red Hat production systems the patch baseline is exactly the same but I’ve put an approval delay of 14 days:

Why the delay? Suppose you patch your test systems every first Tuesday of the month and then you patch the production systems 14 days later. As you want to have the same patches on production and test, the production patches get an approval delay of 14 days. This is how you can make sure that your systems are on the same level even if production is patched later than test. One point to consider is that this does not apply to the Ubuntu systems, as there is no approval delay for Ubuntu patches:

Once all the base lines have been created it looks like this:

The “Patch group” tag

The next key concept that comes into the game are tags. No matter what you will be doing in the cloud, without tags you’re lost. There is just no way around tags if you want to automate processes in the cloud. When it comes to AWS SSM there is the “Patch Group” tag and this tag needs to be assigned to each instance you want to have patched automatically. Let’s assume you want to have the test instances patched every first Saturday of each month and the production instances every third Saturday of each month (6 a.m. in both cases). In addition all the Linux hosts should be patched before the Windows hosts, because the Linux hosts run the databases and the Windows hosts run the application servers or the application (or, for whatever reason, there is a dependency between the hosts and you need to follow a defined order for patching).

What we did to fulfill this requirement is to populate the “Patch Group” tag like this:

  1. All test instances get the value: patch_test_0600_batch_1 or patch_test_0600_batch_2
  2. All production instances get the value: patch_prod_0600_batch_1 or patch_prod_0600_batch_2

The “*batch” values build the dependency between the Linux and the Windows hosts. Taking one test Linux host and one test Windows host as an example, the tag would look like this:

I’ve assigned the tags to all the remaining instances accordingly:

  1. i-amazon-linux-test: patch_test_0600_batch_1
  2. i-centos-linux-test: patch_test_0600_batch_1
  3. i-redhat-linux-test: patch_test_0600_batch_1
  4. i-suse-linux-test: patch_test_0600_batch_1
  5. i-ubuntu-linux-test: patch_test_0600_batch_1
  6. i-windows-test: patch_test_0600_batch_2
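
Tagging can of course also be scripted; with the AWS CLI it could look roughly like this (the instance ID is just a placeholder):

aws ec2 create-tags --resources i-0123456789abcdef0 --tags 'Key=Patch Group,Value=patch_test_0600_batch_1'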

Same procedure for the production instances but using the prod_* tags of course. Only assigning these tags to the instances is not enough though. We need to assign them to the patch baselines as well so that the right baseline gets selected when a patch run is supposed to happen on the systems. Assigning a “Patch Group” to a patch baseline basically links the patch baseline to instances with the same tag value per operating system. If you don’t do that you will always get the default patch baseline for an operating system and you have no chance of applying the same patches to test and prod because you can not have different approval delays.

This needs to be done for all the patch baselines we created above. The patch_test_0600_batch* tags get attached to the baselines for test and the patch_prod_0600_batch* tags get attached to the baselines for production.

Maintenance windows

The last building blocks are the maintenance windows. They define when an action will be executed on which systems. You can not only use them for patching but for many other tasks as well (e.g. running a PowerShell script on Windows hosts, running a shell script on Linux hosts, running an Ansible playbook and much more).

As we want to schedule patching for all the test instances on the first Saturday of the month and for all production systems on the third Saturday of the month, we’ll need two maintenance windows, one for test and one for production. We’ll start with the one for test:

For the scheduling you should go for a cron expression:

“6F” means the first Saturday (0 is Sunday) of each month. Note that if you skip the “Schedule timezone” field you will need to provide UTC time in the cron expression. If you do specify your timezone, AWS automatically calculates the correct UTC time for you, as you can see once the maintenance window is created:
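
If I read the AWS documentation correctly, the same schedule could also be expressed with the “#” day-of-week notation, roughly like this (not verified against this exact setup):

cron(0 6 ? * SAT#1 *)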

Now we have the scheduling, but the maintenance window has no idea what to run and against which instances to run it. Where to run is specified by adding targets:

You should definitely give a name to the target so you can easily identify the target afterwards:

By specifying the “Patch Group” instance tag with our values for the test systems that should run in the first batch we have all the Linux systems registered with this maintenance window. The second target will be the Windows systems that should be patched once the Linux systems have been done:

All test systems registered:

Now that we defined where to run we need to define what to run and this is specified by the “Run command”:

In our case the document is “AWS-RunPatchBaseline” but it could be any of the documents listed as mentioned earlier:

The task priority is 1 (because we want this to be executed first) and the target is “patch_test_0600_batch_1” (this is why the name for target is important. You will probably not remember the unique string that was generated when you registered the target):

The rate control option allows you to control concurrency and error thresholds; for the IAM role go with the defaults:

We highly recommend storing the output in an S3 bucket, otherwise the output of the run command will be truncated after 2000 characters. The parameter should be “Install” (you could also go for “Scan” if you only want to scan for available patches):

Do exactly the same for the patch_test_0600_batch_2 group, with priority 2:

Both run commands are now registered:

And that’s it. Repeat the same for the production machines and all your instances are scheduled for patching. Once the maintenance window has executed you can get the logs in the “Run command” section:



Here you have the link to the log in S3:

Conclusion

When you start using AWS SSM it is a bit hard to understand all the relations between patch baselines, the patch group tag, targets, run commands and maintenance windows. Once you got that, it is quite easy to schedule patching of your whole infrastructure (even on-prem) when you run operating systems AWS has patch baselines for. The logs should definitely go to S3 so you have the full output available.

The article Avoiding patching madness by using AWS SSM appeared first on the dbi services blog.

Connection pooling with PgBouncer

Thu, 2019-11-07 09:38

Some of you may know the situation: As soon as the number of users grows, the number of resource problems increases. Have you ever thought about using a connection pooler? Too complex, too much administration effort? In this post I would like to explain how a connection pooler can help you with your memory usage, and show you how simple it is to set up connection pooling with PgBouncer.

Introduction

By default PostgreSQL forks its main process into child processes for every new connection. In the course of time this can lead to more and more processes on the server. On one hand, this is pretty cool, because it provides stability and a nice view of resource utilization per connection. But if there are many short-lived connections, the disadvantages will predominate: the more connections you have, the more RAM will be used.
The solution for that problem can be a connection pooler like PgBouncer, an open source connection pooling middleware especially designed for PostgreSQL. It acts like a wrapper around the database connections: it handles the connections between the database and the pool, but everything is hidden from the application that connects.
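
To get an idea of how many backend processes (and therefore how much memory) your client connections currently consume, a quick look at pg_stat_activity helps (just a sanity check, nothing PgBouncer specific):

postgres=# select count(*) from pg_stat_activity where backend_type = 'client backend';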

Installation

For the installation of pgBouncer you can decide between two possibilities
1. using yum
2. building from git (https://pgbouncer.github.io/install.html#building-from-git)

To keep it simple, we go on with method 1.

postgres@centos-mini:/u02/pgdata/11/PG1/ [PG1] sudo yum install pgbouncer
Loaded plugins: fastestmirror
Determining fastest mirrors
epel/x86_64/metalink                                                                |  28 kB  00:00:00
 * base: pkg.adfinis-sygroup.ch
 * epel: pkg.adfinis-sygroup.ch
 * extras: pkg.adfinis-sygroup.ch
 * updates: pkg.adfinis-sygroup.ch
base                                                                                | 3.6 kB  00:00:00
epel                                                                                | 5.3 kB  00:00:00
extras                                                                              | 2.9 kB  00:00:00
ius                                                                                 | 1.3 kB  00:00:00
pgdg10                                                                              | 3.6 kB  00:00:00
pgdg11                                                                              | 3.6 kB  00:00:00
updates                                                                             | 2.9 kB  00:00:00
(1/10): base/7/x86_64/group_gz                                                      | 165 kB  00:00:06
(2/10): extras/7/x86_64/primary_db                                                  | 153 kB  00:00:00
(3/10): epel/x86_64/group_gz                                                        |  90 kB  00:00:06
(4/10): epel/x86_64/primary_db                                                      | 6.9 MB  00:00:08
(5/10): epel/x86_64/updateinfo                                                      | 1.0 MB  00:00:08
(6/10): pgdg11/7/x86_64/primary_db                                                  | 337 kB  00:00:01
(8/10): base/7/x86_64/primary_db                                                    | 6.0 MB  00:00:10
(10/10): updates/7/x86_64/primary_db                                                | 2.8 MB  00:00:01
(11/10): ius/x86_64/primary                                                         | 139 kB  00:00:06
(12/10): pgdg10/7/x86_64/primary_db                                                 | 384 kB  00:00:06
ius                                                                                              631/631
Resolving Dependencies
--> Running transaction check
---> Package pgbouncer.x86_64 0:1.12.0-1.rhel7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===========================================================================================================================================================================================================
 Package                           Arch                                               Version                                                     Repository                                          Size
===========================================================================================================================================================================================================
Installing:
 pgbouncer                         x86_64                                             1.12.0-1.rhel7                                              pgdg10                                             207 k

Transaction Summary
===========================================================================================================================================================================================================
Install  1 Package

Total download size: 207 k
Installed size: 477 k
Is this ok [y/d/N]: y
Downloading packages:
pgbouncer-1.12.0-1.rhel7.x86_64.rpm                                                                                                                                                | 207 kB  00:00:06
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : pgbouncer-1.12.0-1.rhel7.x86_64                                                                                                                                                     1/1
  Verifying  : pgbouncer-1.12.0-1.rhel7.x86_64                                                                                                                                                     1/1

Installed:
  pgbouncer.x86_64 0:1.12.0-1.rhel7

Complete!
Configuration pgbouncer.ini

PgBouncer has one central configuration file called pgbouncer.ini. By default it is located under /etc/pgbouncer and it is used to configure the PgBouncer pool.
You can define a lot of parameters in this file, most of them commented out by default, and you can start with a very minimal configuration. It includes generic settings such as logfile, listen_addr, listen_port… as well as connectivity, console access control, admin_users and many, many more. The full list of parameters can be found in the PgBouncer documentation, which is really detailed and gives you a good overview.
Let’s have a look at a simple sample pgbouncer.ini file:

cat /etc/pgbouncer/pgbouncer.ini
[databases]
bouncer= host=localhost dbname=bouncer

[pgbouncer]
listen_addr=127.0.0.1
listen_port= 6432
auth_type = md5
auth_file= /etc/pgbouncer/userlist.txt
admin_users=bounce
pool_mode=session
max_client_conn = 100
default_pool_size = 20
logfile = /etc/pgbouncer/log/pgbouncer.log
pidfile = /etc/pgbouncer/pgbouncer.pid

The ini file has two sections. The [databases] section defines the alias(es) for the database(s); just start a new line for every database, and you can also define user, port etc. there. The [pgbouncer] section is where the pool configuration is done, and here as well you can start with a very minimal configuration.

One of the most important parameters is pool_mode, which defines how a server connection can be reused. Three modes can be defined:
session: This is the default value. The connection is released back to the pool when the client disconnects.
transaction: The connection is released back to the pool once a transaction finishes.
statement: The connection is released back to the pool after a query finishes.

The other parameters in the pgbouncer section shortly explained:
listen_addr: List of addresses to listen on for TCP connections.
listen_port: Listening port.
admin_users: Users from the auth_file which get access to a special pgbouncer database. This database provides performance-related information about PgBouncer.
max_client_conn: The maximum number of client connections allowed. The default value is 100, but there is also a formula to calculate the value.
default_pool_size: The number of server connections allowed per user/database pair. The default value is 20.
logfile: This one is self-explanatory: the log file location.
pidfile: The location of the PID file.
auth_type and auth_file: These two belong together. auth_type specifies how to authenticate users (pam|hba|md5) against PgBouncer and auth_file contains the usernames and passwords.

userlist.txt

As already mentioned, this file has a very simple structure: username and password. You can either write the password in plain text or as the MD5 hash of the password. Of course it is not very secure to put the plain text password in here.

cat /etc/pgbouncer/userlist.txt
"bounce" "md51db1c086e81505132d1834e06d07420e"
"postgres" "md53175bce1d3201d16594cebf9d7eb3f9d"
Start PgBouncer

Now all the configuration is done and PgBouncer can be started. It is possible to start PgBouncer on the command line, where you see the log output directly:

postgres@centos-mini:/etc/pgbouncer/ [PG1] /bin/pgbouncer pgbouncer.ini
2019-11-06 19:40:05.862 CET [13498] LOG kernel file descriptor limit: 1024 (hard: 4096); max_client_conn: 100, max expected fd use: 192
2019-11-06 19:40:05.864 CET [13498] LOG listening on 127.0.0.1:16432
2019-11-06 19:40:05.864 CET [13498] LOG listening on unix:/tmp/.s.PGSQL.16432
2019-11-06 19:40:05.864 CET [13498] LOG process up: PgBouncer 1.12.0, libevent 2.0.21-stable (epoll), adns: libc-2.17, tls: OpenSSL 1.0.2k-fips  26 Jan 2017
2019-11-06 19:41:05.868 CET [13498] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 0 B/s, xact 0 us, query 0 us, wait 0 us
2019-11-06 19:41:39.325 CET [13498] LOG C-0x18da360: db1/user1@127.0.0.1:58648 login attempt: db=db1 user=user1 tls=no
2019-11-06 19:41:39.326 CET [13498] LOG C-0x18da360: db1/user1@127.0.0.1:58648 closing because: client unexpected eof (age=0s)
2019-11-06 19:41:47.577 CET [13498] LOG C-0x18da360: db1/user1@127.0.0.1:58652 login attempt: db=db1 user=user1 tls=no
2019-11-06 19:41:47.579 CET [13498] LOG S-0x18e0c30: db1/user1@[::1]:5432 new connection to server (from [::1]:37654)
2019-11-06 19:42:05.869 CET [13498] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 0 B/s, xact 0 us, query 0 us, wait 157 us
2019-11-06 19:43:05.872 CET [13498] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 2 B/s, xact 1522 us, query 1522 us, wait 0 us

Furthermore it is possible to create a service which starts automatically in the background after every reboot:

cat /etc/systemd/system/pgbouncer.service
[Unit]
Description=A lightweight connection pooler for PostgreSQL
After=syslog.target
After=network.target

[Service]
Type=simple

User=postgres
Group=postgres

# Path to the init file
Environment=BOUNCERCONF=/etc/pgbouncer/pgbouncer.ini

PIDFile=/var/run/pgbouncer/pgbouncer.pid

# Where to send early-startup messages from the server
# This is normally controlled by the global default set by systemd
# StandardOutput=syslog

ExecStart=/bin/pgbouncer ${BOUNCERCONF}
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGINT

# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=300

[Install]
WantedBy=multi-user.target
postgres@centos-mini:/etc/ [PG1] sudo systemctl start pgbouncer
postgres@centos-mini:/etc/ [PG1] sudo systemctl status pgbouncer
● pgbouncer.service - A lightweight connection pooler for PostgreSQL
   Loaded: loaded (/etc/systemd/system/pgbouncer.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-11-07 15:17:09 CET; 4s ago
 Main PID: 17298 (pgbouncer)
   CGroup: /system.slice/pgbouncer.service
           └─17298 /bin/pgbouncer /etc/pgbouncer/pgbouncer.ini

Nov 07 15:17:09 centos-mini systemd[1]: Started A lightweight connection pooler for PostgreSQL.
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG kernel file descriptor limit: 1024 (hard: 4096); max_client_conn: 100, max expected fd use: 172
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG listening on 127.0.0.1:6432
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG listening on unix:/tmp/.s.PGSQL.6432
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG process up: PgBouncer 1.12.0, libevent 2.0.21-stable (epoll), adns: libc-2.17, tls: OpenSSL 1.0.2k-fips  26 Jan 2017
15:17:13 postgres@centos-mini:/etc/ [PG1]
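
Note that the unit is reported as "disabled" in the status output above, so it would not yet come up automatically after a reboot. Enabling it is the usual additional step (assumed here, not part of the listing above):

postgres@centos-mini:/etc/ [PG1] sudo systemctl enable pgbouncer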

PgBouncer is running now and you can connect to the database using PgBouncer.

postgres@centos-mini:/etc/ [PG1] psql -U bounce -p 6432 -h localhost bouncer
Password for user bounce:
psql (11.4 dbi services build)
Type "help" for help.
bouncer=>
bouncer=>
Monitoring

I already mentioned the admin users before and want to explain them in a bit more detail now.
PgBouncer allows users with admin rights to connect to the virtual database “pgbouncer”. You can use this database to see who is connecting, how many active connections are in a pool and much more. As a good starting point, use “SHOW HELP” as soon as you are connected.

postgres@centos-mini:/etc/ [PG1] psql -U bounce -p 6432 -h localhost pgbouncer
Password for user bounce:
psql (11.4 dbi services build, server 1.12.0/bouncer)
Type "help" for help.
pgbouncer=# SHOW HELP;
NOTICE:  Console usage
DETAIL:
        SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|USERS|VERSION
        SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM
        SHOW DNS_HOSTS|DNS_ZONES
        SHOW STATS|STATS_TOTALS|STATS_AVERAGES|TOTALS
        SET key = arg
        RELOAD
        PAUSE [<db>]
        RESUME [<db>]
        DISABLE <db>
        ENABLE <db>
        RECONNECT [<db>]
        KILL <db>
        SUSPEND
        SHUTDOWN
SHOW
pgbouncer=#
pgbouncer=# SHOW POOLS;
 database  |   user    | cl_active | cl_waiting | sv_active | sv_idle | sv_used | sv_tested | sv_login | maxwait | maxwait_us | pool_mode
-----------+-----------+-----------+------------+-----------+---------+---------+-----------+----------+---------+------------+-----------
 bouncer   | bounce    |         2 |          0 |         0 |       0 |       1 |         0 |        0 |       0 |          0 | session
 pgbouncer | pgbouncer |         1 |          0 |         0 |       0 |       0 |         0 |        0 |       0 |          0 | statement
(2 rows)

pgbouncer=# SHOW CLIENTS;
 type |  user  | database  | state  |   addr    | port  | local_addr | local_port |      connect_time       |      request_time       | wait | wait_us | close_needed |    ptr    | link | remote_pid | tls
------+--------+-----------+--------+-----------+-------+------------+------------+-------------------------+-------------------------+------+---------+--------------+-----------+------+------------+-----
 C    | bounce | bouncer   | active | 127.0.0.1 | 40322 | 127.0.0.1  |       6432 | 2019-11-07 15:24:40 CET | 2019-11-07 15:24:40 CET |    0 |       0 |            0 | 0x1bd9598 |      |          0 |
 C    | bounce | bouncer   | active | 127.0.0.1 | 40332 | 127.0.0.1  |       6432 | 2019-11-07 15:25:12 CET | 2019-11-07 15:25:12 CET |    0 |       0 |            0 | 0x1bd97b0 |      |          0 |
 C    | bounce | pgbouncer | active | 127.0.0.1 | 40314 | 127.0.0.1  |       6432 | 2019-11-07 15:21:50 CET | 2019-11-07 15:25:36 CET |  221 |  440169 |            0 | 0x1bd9380 |      |          0 |
(3 rows)
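
Beside the SHOW commands the console also accepts administrative commands such as RELOAD, PAUSE and RESUME (all listed in SHOW HELP above). A typical use case is re-reading pgbouncer.ini after a configuration change without restarting the pooler (sketch, issued as an admin user on the console):

pgbouncer=# RELOAD;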
Conclusion

The above configuration is only a simple example of what a setup can look like. Of course there are many more specifics you can define. PgBouncer is a great tool for connection pooling and can help you to scale down the memory usage of your server. The connections in a pool are always available and, in contrast to forking a new backend process for every client, there is no need to reach out to the database and establish a connection every time. The connections are just there.

The article Connection pooling with PgBouncer appeared first on Blog dbi services.
